How to filter data from OData query on Azure Data Factory?

I need create a query inside the OData source connection in Azure Data Factory to Invantive Cloud.

For example: My database name is Simplicate, alias is spe and column to filter on is a date named created_at. Setup as on Connect Azure Data Factory to Simplicate, 180 days free.

How can I create an OData service URL query with filter for use in Azure Data Factory?

Azure Data Factory

The data from Simplicate is retrieved from Invantive Cloud into Microsoft Azure Data Factory. This works for Simplicate, but also for many other cloud platforms such as Exact Online, Salesforce or ActiveCampaign as described step-by-step on Retrieve a specific Exact Online OData4 data set using Azure Data Factory.

Are you looking for filtering OData on Power BI: a more practical guide is available at Reduce data volume and improve performance of your Power BI report by Filtering.

Make sure you understand the structure of Invantive Cloud before starting to optimize the performance and reducing the download. Learn the structure on Invantive Cloud Structure. It has a nice picture in it too!

Azure Data Factory enables you to transport and route the data from the OData connector of Invantive Cloud and other OData producers (OData sources) to your data lake or a database on your SQL Server instance.

Invantive Cloud is an OData producer (OData source): it bridges the Simplicate internal API protocols to OData, adds metadata and production features such as caching, retries and logging.

Note that there are two versions of the OData-protocol with slight differences. Older producers (sources) work with OData version 3, but OData 4 is strongly recommended for current work since it has more operators, provides better throughput and has a larger remaining lifetime while still evolving to become even more advanced.

The OData v3 specification is available at OData Version 3.0 · OData - the Best Way to REST. The approved OData v4 specification is available at Documentation · OData - the Best Way to REST, while upcoming changes can be found at GitHub - OData/odataorg.github.io: Staging site and collaboration repository for http://www.odata.org.

OData v4 is supported for use by Azure Data Factory, Power Query, Power BI and all other known recent versions of Microsoft products.

For the purpose of filtering, an OData query is:

  • First rewritten into Invantive SQL, translating each OData filter into a filter on the SQL where-clause.
  • Then Invantive SQL is executed against existing caches and when necessary API calls on the web services source of Simplicate are performed by the Invantive driver (kind of ODBC).

Note that Invantive Cloud always uses the current version 4 of OData in the OData connector. The actual data source may be using an API with OData 2 or OData 3 or even XML, SOAP or just JSON or even be a native relational database format like PostgreSQL, IBM DB2 or SQL Server.

Server-side Filtering

The translation by Invantive SQL into calls on the Invantive driver (compare it with ODBC-driver use) abstracts the actual protocol away.

This translation into Invantive SQL as intermediate language for your requests using Azure Data Factory before forwarding the request (when necessary for reloading caches) to the actual data source also supports forwarding of the filter to the data source. For example, a OData filter with an equals operator is rewritten into an equality operator in Invantive SQL. The equality is forwarded to the data source with many APIs. Not all APIs have extensive filtering support, but Simplicate does. Other popular data sources with fine filtering are AFAS Profit, Twinfield, Chargebee, Visma.net Financials, Visma eAccounting and Exact Online. This forwarding of filters is called “server-side filtering”.

This concept of forwarding filter criteria (called “server-side filtering”) does not apply always.

For instance, filters on fields added manually by Invantive are most often not forwarded. There are some exceptions like DivisionLabel on Exact Online. Although it is not an actual column in the Exact Online API filters on it are translated into specific divisions to be queried.

Neither are all APIs capable on filtering on any field. For instance, Exact Online has column-specific exceptions. But also Teamleader has considerable limitations.

Whether filters on fields can ultimately be forwarded to the cloud platform can be found in the documentation. Look at the column CAN_FILTER_SERVER_SIDE in SystemTableColumns@DataDictionary.

Finally note that server-side filtering is not possible with many more advanced filters, such as complex combinations of and, or and maybe even calculations. Invantive SQL has an extensive optimizer which simplifies expressions, performs peephole optimizations and rewrites queries, but still there are many thinkable constructs that can not be optimized yet. During regular monitoring we try to identify further optimization possibilities that impact many users and try to optimize these over time too.

OData Filter

The remaining step is to construct the OData $filter. Assuming you want to achieve this effect in SQL:

select *
from   table
where  created_at >= VALUE

you first need to check the data type of created_at.

You will notice that the Simplicate API has a typical twist: the data type of dates are not actually dates but a text data type. The good news is that the date’s text representation orders lexicographically identical to the dates as a date data type, so for instance January 1, 2020 comes before December 31, 2020, both as text representation as well as date. You can find more information on the date format of Simplicate on Limit Data load from Simplicate on Invantive Cloud to Azure Data Factory.

Translate Filter to OData in Power BI

The SQL statement needs to be translated into OData. The OData implementation of Power BI and Power Query is well done, so the easiest way to do so is to use an OData data source in Power BI and then watch the actual OData queries sent to the OData service.

The steps to set up an OData filter on Simplicate (or any other supported data source) are:

The watching is most easily done using the Invantive Bridge Online monitor of Invantive Cloud, so execute the following steps:

/invantive-simplicate-1/odata4/Simplicate.Hours.Hours@spe?$filter=projectservice_name%20eq%20%27Consultancy%20op%20locatie%27&$top=1000

The OData filter to use is the text starting at $filter. The filter will have been escaped for use in an URL, so you need to unescape the text. You can do so manually or use an online service like https://www.url-encode-decode.com/.

Since July 2021, the monitor also displays the OData filter next to the SQL generated for it as shown on More insight with new Bridge Online Monitoring.

Manually Construct an OData Filter

An OData filter can also be constructed manually. The OData specification is online available. The filters are part of the section 11.2.6 on OData queries.

The basics are to specify your filters after $filter in the URL. Filters can be combined using and and or, but also use criteria consisting of a left-hand side (for instance a column name), an operator like eq for “equals” and a right-hand side (for instance a constant value). Filter groups can be composed using the grouping operators ( and ).

The OData implementation of Invantive Bridge supports many OData filter operators, but not all. For instance, hassubset is not available. In case you need include a filter that is not supported please report to Invantive support with a short use case. Depending on the complexity and viability of the use case we will consider adding that specific OData filter to Invantive Bridge Online.

As a workaround you can of course always use a filter within Azure Data Factory itself to filter your data to your needs.

Construct an OData Filter using Query Tool

All recent Invantive SQL based products include a driver for OData data sources on the Bridge Online service. They enable you to retrieve data from Invantive Cloud using Invantive SQL and the OData bridge.

Execute the following steps:

  • Start Invantive Query Tool.
  • Open the database group “Various”.
  • Choose the database “Invantive Bridge Online”.
  • Enter the database parameters as follows:
  • Click on “Connect”.
  • Construct your database query like:
select * 
from   [Hours@Spe] 
where  created_at >= '2020-11-01'
  • Note that the table name is Hours@spe including spe, since the spe is interpreted on the Bridge Online service. When you have a local alias such as bol for Bridge Online, the full name would be [Hours@spe]@bol.
  • Invantive SQL translates the call into an OData query, including a filter.
  • The actual OData query can be found on Bridge Online monitoring.
  • And the actual OData query can be retrieved from the local Invantive SQL instance using:
select url
from   sessionios@datadictionary

with result:

Some OData Filter Examples

As an inspiration the following list of example OData filters for use with queries may help you:

  • ExactOnlineREST.Cashflow.PaymentsBulk@eol?$filter=DivisionCompanyName ne 'ACME Holding' and InvoiceDate ge 2019-01-01T00:00:00+01:00: retrieve all payments after January 1, 2019 across all partitions except the holding.
  • ExactOnlineREST.Financial.GLAccountClassificationMappings@eol?$filter=GLSchemeDescription eq 'G/L Account Scheme': retrieve all general ledger account classifications for the reporting schema “G/L Account Scheme”.
  • ExactOnlineREST.Financial.ExchangeRates@eol?$filter=Created gt 2020-01-01T00:00:00+01:00: retrieve all exchange rates created after January 1, 2020.
  • Teamleader.V1Flat.ProjectMilestones@tlr?$filter=responsible_user_id eq 1 or responsible_user_id eq 2: retrieve all milestones on Teamleader projects with the users with ID 1 or 2 being responsible.
  • ExactOnlineREST.FinancialTransaction.TransactionLinesBulk@eol?$filter=FinancialYear ge 2018 and Division eq 123456: retrieve all financial transactions for the financial year 2018 of Exact Online company 123456.

Apply OData $filter using Azure Data Factory on OData Source

Using the filter constructed for the OData service, the query can be adapted like in the following picture (sample using Data Factory on Exact Online France):

Een bericht is gesplitst naar een nieuw topic: Error itgenstp015