Pre-load data in the OData cache to avoid Power BI timeouts

This wiki article explains how to pre-load cloud data into the OData cache of Invantive Cloud, enabling lightning-fast retrieval of cloud data by Power BI at network-capacity speeds.

The OData data source of Power BI, Power Query, Azure Data Factory and the Power BI Service offers great functionality and filtering options. However, the data volumes handled by Invantive Cloud users sometimes exceed the current capabilities and the allocated capacity on Azure.

Look at Facts

The first step is always to determine what is happening and whether it aligns with your actual business goals: look in the Power BI Service, but also look at Bridge Online Monitoring to see whether the data was shipped successfully, in what amount of time, and with the expected quantity and quality of data.

Tune Performance

The next step is to apply performance optimizations. A good starting point to educate yourself about the wide range of possible performance optimizations is Overview of Power BI Performance and Download Size Improvement Techniques.

Increase Timeout for more Runtime

The refresh timeout for OData downloads in Power BI is only 10 minutes and is not centrally configurable in Power BI Desktop or the Power BI Service, nor on a data source.

The next step is therefore to increase the timeout values from the default 10 minutes to a few hours, granting large downloads or slow APIs sufficient time to get the data across when refreshing data. The configuration of the OData feed timeout in M for a dataset is described in Avoid timeout error on Power BI OData download.

Pre-load Data in Cache

The last step is to pre-load the data into the OData cache of Invantive Cloud. Pre-loading the data into the OData cache ensures that the data is readily available at speeds of up to 500 MB/s of compressed data when Power BI requests it. No interaction with a cloud app API is necessary; the data is already stored on disk in an encrypted format.

The use of pre-loaded data effectively renders all Power BI timeouts irrelevant: even a modest 100 Mbit link transfers roughly 7.5 GB within 10 minutes, which, thanks to compression, corresponds to some 60 gigabytes of uncompressed data.
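
The throughput claim above can be checked with a small back-of-the-envelope calculation. The ~8:1 compression ratio below is an illustrative assumption for typical OData payloads, not a measured figure:

```python
# How much data fits through a 100 Mbit/s link in 10 minutes, and what
# that corresponds to uncompressed (assumed ~8:1 compression ratio)?
link_mbit_per_s = 100
seconds = 10 * 60
compression_ratio = 8  # assumption for this sketch

compressed_gb = link_mbit_per_s / 8 * seconds / 1000  # Mbit/s -> MB/s -> MB -> GB
uncompressed_gb = compressed_gb * compression_ratio

print(f"compressed: {compressed_gb:.1f} GB, uncompressed: {uncompressed_gb:.1f} GB")
# -> compressed: 7.5 GB, uncompressed: 60.0 GB
```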

However, the Power BI Service can still fail for large volumes of data: a data set of 60 GB of uncompressed data can take longer to process within the Power BI Service than allowed by Azure, or trigger a failure due to a software bug. You might still need to apply optimizations to your data sets, filters and transformations to prevent the Power BI Service itself from timing out. Upgrading to a more expensive plan such as Power BI Premium Per User (PPU) often helps.

In a future release, pre-loading data into the cache might be available as standard functionality to supplement the current pull principle.

Collect OData Query URLs

Execute the following steps to pre-load cloud data into the OData cache:

  • Open Bridge Online Monitoring.
  • Sort your requests in descending order of duration.
  • Select each request that you need to pre-load, but restrict yourself to the top 10 slowest ones.
  • Right-click on the hyperlink next to “Path”.
  • Choose “Copy hyperlink”.
  • Paste all hyperlinks into a text file.

The hyperlinks will resemble the following when you have selected a column list:

https://bridge-online.cloud/acme-exact-online-3/odata4/ExactOnlineREST.Incremental.ItemsIncremental@eol?$filter=Division%20eq%20123456&$select=Timestamp%2CAverageCost%2CBarcode%2CClass_01%2CCode%2CCostPriceNew%2CCostPriceStandard%2CCreated%2CCreator%2CDescription%2CDivision%2CEndDate%2CExtraDescription%2CFreeTextField_01%2CFreeTextField_02%2CFreeTextField_03%2CFreeTextField_04%2CFreeTextField_05%2CFreeTextField_06%2CFreeTextField_07%2CGLCosts%2CGLRevenue%2CGLStock%2CGrossWeight%2CID%2CIsBatchItem%2CIsFractionAllowedItem%2CIsMakeItem%2CIsNewContract%2CIsOnDemandItem%2CIsPackageItem%2CIsPurchaseItem%2CIsSalesItem%2CIsSerialItem%2CIsStockItem%2CIsSubcontractedItem%2CIsTaxableItem%2CIsTime%2CIsWebshopItem%2CItemGroup%2CModified%2CModifier%2CNetWeight%2CNetWeightUnit%2CNotes%2CPictureName%2CPictureThumbnailUrl%2CPictureUrl%2CSalesVatCode%2CSearchCode%2CSecurityLevel%2CStartDate%2CUnit%2CUnitType%2CStatisticalCode%2CStatisticalNetWeight%2CStatisticalUnits%2CStatisticalValue

Or a simpler one, retrieving all columns with a single filter step:

https://bridge-online.cloud/acme-exact-online-3/odata4/ExactOnlineREST.Incremental.ItemsIncremental@eol?$filter=Division%20eq%20123456
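
To understand what such a hyperlink contains, it can be decoded with a few lines of standard-library Python; this sketch uses the simpler example URL from above:

```python
from urllib.parse import urlsplit, parse_qs

# Decode one of the copied hyperlinks to inspect its OData query options.
url = ("https://bridge-online.cloud/acme-exact-online-3/odata4/"
       "ExactOnlineREST.Incremental.ItemsIncremental@eol"
       "?$filter=Division%20eq%20123456")

parts = urlsplit(url)
query = parse_qs(parts.query)

print(parts.path)           # table path, including the @eol alias
print(query["$filter"][0])  # decoded filter: Division eq 123456
```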

Create a Batch File to Pre-load OData Cache

Alter the text file created above into a batch file such as c:\temp\preload.bat, using the following template:

set CLOUD_USER=myinvantivecloudaccount
set CLOUD_PASSWORD=verysecret
set DUMMY_OUT_FILE=c:\temp\dummy.bin
set ENABLE_ZIP=-H "Accept-Encoding: gzip"
set OTHER_OPTIONS=--retry 3 --silent

REM Double the percent signs in the URL (%%20) so cmd.exe does not expand
REM %2 as a batch parameter; quote the URL to protect characters such as &.
curl %ENABLE_ZIP% %OTHER_OPTIONS% -u %CLOUD_USER%:%CLOUD_PASSWORD% --output %DUMMY_OUT_FILE% --url "https://bridge-online.cloud/acme-exact-online-3/odata4/ExactOnlineREST.Incremental.ItemsIncremental@eol?$filter=Division%%20eq%%20123456"
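
For those who prefer Python over curl, the same pre-load request can be built with only the standard library; credentials and URL are the same placeholders as in the batch file above:

```python
import base64
import urllib.request

# Build the same pre-load request as the curl call above.
cloud_user = "myinvantivecloudaccount"
cloud_password = "verysecret"
url = ("https://bridge-online.cloud/acme-exact-online-3/odata4/"
       "ExactOnlineREST.Incremental.ItemsIncremental@eol"
       "?$filter=Division%20eq%20123456")

token = base64.b64encode(f"{cloud_user}:{cloud_password}".encode()).decode()
request = urllib.request.Request(url, headers={
    "Accept-Encoding": "gzip",          # same effect as ENABLE_ZIP
    "Authorization": f"Basic {token}",  # same effect as curl -u
})

# Discard the body; the request only serves to warm the OData cache:
# urllib.request.urlopen(request, timeout=3600).read()
```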

You can also compose your own URLs from elements using a template such as:

set CLOUD_USER=myinvantivecloudaccount
set CLOUD_PASSWORD=verysecret
set DUMMY_OUT_FILE=c:\temp\dummy.bin
set ENABLE_ZIP=-H "Accept-Encoding: gzip"
set OTHER_OPTIONS=--retry 3 --silent

set DATABASE_PREFIX_URL=https://bridge-online.cloud/acme-dummy/odata4

set TABLE_NAME=Dummy.Demo.ProjectPhases
set ALIAS=dmy
REM Double the percent signs so cmd.exe does not expand %2 as a parameter.
set FILTERS=?$filter=pss_aangemaakt_door%%20eq%%20%%27system%%27

set FULL_URL=%DATABASE_PREFIX_URL%/%TABLE_NAME%@%ALIAS%%FILTERS%

curl %ENABLE_ZIP% %OTHER_OPTIONS% -u %CLOUD_USER%:%CLOUD_PASSWORD% --output %DUMMY_OUT_FILE% --url "%FULL_URL%"
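
The same composition from elements can be sketched in Python, where percent-encoding of the filter is handled by the standard library; the names below are the same placeholders used in the batch template:

```python
from urllib.parse import quote

# Compose the OData URL from its elements, mirroring the batch template.
database_prefix_url = "https://bridge-online.cloud/acme-dummy/odata4"
table_name = "Dummy.Demo.ProjectPhases"
alias = "dmy"
odata_filter = "pss_aangemaakt_door eq 'system'"

# quote() percent-encodes the spaces and quotes in the OData filter.
full_url = f"{database_prefix_url}/{table_name}@{alias}?$filter={quote(odata_filter)}"
print(full_url)
```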

Daily Pre-load the OData Cache

Create a daily job in the Microsoft Task Scheduler to pre-load the data from Windows. The creation of such a task is described, for instance, in Starting Invantive Data Hub scripts from Windows Task Scheduler - #2 by forums.

Considerations

Pre-loading the cloud data into the OData cache with curl effectively downloads the data multiple times. With large volumes of data, the Fair Use Policy might hinder this. In such cases, please open a discussion on these forums so that Invantive can better understand your needs.

For the near future, we are working on an idea to refresh the databases daily in the background. For more information, refer to Daily open and query databases to check consistency and rotate refresh tokens.