Faster OData downloads with smaller size with small adjustment in Power BI

forums · 18 januari 2022 om 19:36

Invantive Cloud offers access through OData to dozens of (cloud) platforms. We use OData version 4.01 with extensions from the next 4.02 version.

By default, Power BI requests the value of all fields. However, often a value is missing. The column name plus a value “null” is then returned.

With a small modification to the OData.Feed statement in the Power BI Advanced Editor, the download can be accelerated and also significantly reduced in volume by removing the overhead of missing values and their column names.

Customizing `OData.Feed` in Power BI Advanced Editor.

An OData.Feed statement looks like this in the Advanced Editor:

let
    Source = OData.Feed("URL", null, [Implementation="2.0"]),

By adding , ODataVersion=4, OmitValues=ODataOmitValues.Nulls and immediately also setting a higher timeout as described in Avoid timeout error on Power BI OData download and applying Link downloads to Power BI data sets you then get:

let
    Source = OData.Feed("URL", null, [Implementation="2.0", ODataVersion=4, OmitValues=ODataOmitValues.Nulls, Headers=[Referer = "change-to-your-unique-id-for-the-source" ], Timeout=#duration(0,4,0,0)]),

Faster and Smaller OData Downloads Outside of Power BI

It is also possible to make use of the faster and smaller downloads outside of Power BI by omitting missing values and the corresponding column names.

For this purpose, the following HTTP headers should be set:

OData-MaxVersion to the value 4.0.
Preferto the valueomit-values=nulls`.

Invantive Bridge Online will then set the value omit-values=nulls in the HTTP header Preference-Applied in the data set to confirm the request.

Measurements

A number of measurements were performed to determine the impact of omitting the missing values.

Table	Size before (KB)	Compressed size before (KB)	Size after (KB)	Compressed size after (KB )
Dummy.Demo.Persons	829	101	428	71
REST: TransactionLinesIncremental	6,738	710	4,531	652
REST: TransactionLinesBulk	15,104	1,857	9,971	1618
XML: GLTransactionLines	371,791	33,537	6,758	388
Twinfield	1200	80	75	6

The first table is a simple table with personal data and limited denormalization.

The other three tables each contain the same approximately 6,000 general ledger lines from the Exact Online accounting package.

The measurements show that tables based on a well-normalized data model have relatively little gain in download size (15-30% lower for Persons and TransactionLinesIncremental).

Similarly, the table TransactionLinesBulk with relatively large amounts of denormalized data will be only 15% smaller. This is consistent with the expectation that the compression algorithm will merge the duplicate data by denormalization to a certain level.

However, for the table GLTransactionLines the gain is enormous. This is a “sparse” table with many fields without values. Due to lacking definitions in the Exact Online metadata definitions (XSD), it is not known which fields are actually present with a value and very many fields will be returned while only a relatively small number are filled. The compression does ensure that the duplicate data is merged as much as possible, but a generic compression algorithm runs into its limits here.

The download by omitting redundant empty field values leads to an almost 100x smaller download for the XML format. It is quite measurable that after this optimization, the table TransactionLinesBulk suffers relatively from the denormalized nature. It is and remains over 2x as large as the incremental download table and about 4x as large as the (somewhat more limited in terms of meaning elements) XML version of the general ledger transactions.

In general, all tables would still benefit from support for the Brotli compression protocol by Power BI and honoring of ETag and Cache-Control HTTP response headers by Power BI. This could reduce the amount of network traffic by about 10% for production use and a factor of 10 for development work by data specialists.

Availability

The new optimization capabilities can be added now without risk.

The release supporting the new optimization capabilities is expected to be in production by January 21, 2022.

Topic		Antwoorden	Weergaven
Snellere OData downloads met kleinere omvang met kleine aanpassing in Power BI Wiki (nl) power-bi , performance , invantive-cloud , odata4	0	850	18 januari 2022
Reduce download size for Power BI reports by selecting solely the needed columns over OData Wiki (en) power-bi , performance , invantive-cloud , invantive-bridge-online , power-query	1	1620	27 november 2024
Downloads verkleinen voor Power BI-rapporten door alleen de benodigde kolommen te selecteren via OData Wiki (nl) performance , power-bi , invantive-bridge-online , invantive-cloud , power-query , exact-online	0	325	13 oktober 2023
Overview of Power BI Performance and Download Size Improvement Techniques Wiki (en) power-bi , power-query , invantive-cloud , invantive-bridge-online , performance	1	1952	11 februari 2022
Faster data processing from Invantive Cloud on selected OData clients Wiki (en) performance , power-bi , invantive-cloud	0	705	4 november 2021

Faster OData downloads with smaller size with small adjustment in Power BI

Customizing OData.Feed in Power BI Advanced Editor.

Faster and Smaller OData Downloads Outside of Power BI

Measurements

Availability

Gerelateerde topics

Customizing `OData.Feed` in Power BI Advanced Editor.