The wiki explains how to pre-load cloud data in the OData-cache of Invantive Cloud for lightning fast retrieval at netwerk capacity speeds of cloud data by Power BI.
The OData-data source of Power BI, Power Query, Azure Data Factory and the Power BI Service offers great functionality and filtering options. However, the data amounts handled by Invantive Cloud-users sometimes exceed the current capabilities and allocated capacity on Azure.
First step is to always first determine what is happening and whether it aligns with your actual business goals: look in the Power BI Service, but also look at Bridge Online Monitoring to see whether the data was shipped successfully, in what amount of time and with the expected quantity and quality of data.
Next step is to apply performance optimizations. The start page to educate yourself about a wide range of possible performance optimization is Overview of Power BI Performance and Download Size Improvement Techniques.
The timeout on refresh time for Power BI for OData-downloads is only 10 minutes and not centrally configurable in Power BI Desktop or Power BI Service, nor on a data source.
Next step is to make sure that time out values are increased from the default 10 minutes to a few hours, to grant large downloads or slow API’s sufficient time to get the data across when refreshing data. The configuration of OData feed time out in M of a dataset is described on Avoid timeout error on Power BI OData download.
Last step is to pre-load the data in the OData-cache of Invantive Cloud. Pre-loading the data into the OData-cache ensures that the data is readily available at a speeds up to 500 MB/sec of compressed data when Power BI requests it. No interaction with a cloud app API is necessary; the data is lingering around on the disks in an encrypted format.
The use of pre-loaded data effectively renders all Power BI timeouts irrelevant, since within 10 minutes 60 gigabytes of uncompressed data can be downloaded even across a modest 100 megabit link.
However, the PowerBI Service can still fail for large volumes of data, since a data set of 60 GB uncompressed data can take longer to process within PowerBI Service than allowed by Azure or trigger a failure due to a software bug. You still might need to apply optimizations in your data sets, filters and transformations to avoid PowerBI Service itself from timing out. Changing to a more expensive plan such as Power BI Premium per User (PPU) often helps.
In a future release, pre-loading data into cache might be available as standard functionality to supplement the current pull-principle.
Execute the following steps to pre-load cloud data in OData cache:
- Open Bridge Online Monitoring.
- Sort your requests descending by duration.
- Select each request that you need to pre-load, but restrict yourself to the top-10 slow ones.
- Right-click on the hyperlink next to “Path”.
- Choose “Copy hyperlink”.
- Paste all hyperlinks into a text file.
The hyperlinks will resemble the following when you have selected a column list:
Or a more simple one, retrieving all columns with one filter step, immediately after the download so that PowerBI’s query-folding can pass on the filter:
Alter the text file create to become a batch file such as
c:\temp\preload.bat, using the following template:
set CLOUD_USER=myinvantivecloudaccount set CLOUD_PASSWORD=verysecret set DUMMY_OUT_FILE=c:\temp\dummy.bin set ENABLE_ZIP=-H "Accept-Encoding: gzip" set OTHER_OPTIONS=--retry 3 --silent curl %ENABLE_ZIP% %OTHER_OPTIONS% -u %CLOUD_USER%:%CLOUD_PASSWORD% --output %DUMMY_OUT_FILE% --url https://bridge-online.cloud/acme-exact-online-3/odata4/ExactOnlineREST.Incremental.ItemsIncremental@eol?$filter=Division%20eq%20123456
You can also compose your own URLs from elements using a template such as:
set CLOUD_USER=myinvantivecloudaccount set CLOUD_PASSWORD=verysecret set DUMMY_OUT_FILE=c:\temp\dummy.bin set ENABLE_ZIP=-H "Accept-Encoding: gzip" set OTHER_OPTIONS=--retry 3 --silent set DATABASE_PREFIX_URL=https://bridge-online.cloud/acme-dummy/odata4 set TABLE_NAME=Dummy.Demo.ProjectPhases set ALIAS=dmy set FILTERS=?$filter=pss_aangemaakt_door%20eq%20%27system%27 set FULL_URL=%DATABASE_PREFIX_URL%/%TABLE_NAME%@%ALIAS%%FILTERS% curl %ENABLE_ZIP% %OTHER_OPTIONS% -u %CLOUD_USER%:%CLOUD_PASSWORD% --output %DUMMY_OUT_FILE% --url %FULL_URL%
Create in the Microsoft Task Scheduler a daily job to pre-load the data from Windows. The creation of such a task is described for instance in Starting Invantive Data Hub scripts from Windows Task Scheduler - #2 door forums.
Pre-loading the cloud data in OData-cache with
curl actually downloads the data multiple times. With large volumes of data, the Fair Use policy might hinder this. In such cases, please open a discussion on these forums to enable Invantive to better understand your needs.