Fine-tune when data becomes obsolete for replicated tables

Problem

Despite hourly refresh of your Data Replicator environment, the data gets refreshed only during the course of a business day, whereas your Power BI reports are refreshed earlier on at a fixed time like 7 AM. Your Power BI users depend on having data of the previous business day, but the data lags behind one more day.

Solution

There are multiple elements considered whether to refresh the data of the combination of a table and partition (such as Exact Online company) using a normal refresh:

  1. Is a connection available to the data container (such as Exact Online)?
  2. Is the data considered obsolete?

The availability of a connection to the data container requires two pre-conditions:

  1. The Invantive program has been started: this can be done by running for instance Invantive Data Hub from a task at a suitable frequency.
  2. The data container is defined and opened with the database connection.: make sure the database definition as defined in the settings*.xml files includes and opens the data container.
  3. The database connection includes the use of Data Replicator: make sure the database definition as defined in the settings*.xml files has an attribute dataCacheConnectionString, such as:
    dataCacheConnectionString="provider=SqlServer;connectionString={Server=something.database.windows.net;Database=db;User Id=usr;Password=secret}"

A table partition is considered obsolete when the active version with the facts for the table partition has become obsolete. This information is available in the Data Replicator repository view dc_table_partition_versions_r.

Also, a table partition can be considered obsolete when an ods Invantive SQL-hint is given in an Invantive SQL statement, such as:

select /*+ ods(true, interval '1 seconds') */ * 
from   exactonlinerest..journals

A very short interval ensures that the journal data is considered obsolete and refreshed.

As an alternative, you can use a forced refresh such as:

alter persistent cache force refresh

This refreshes all combinations of tables and partitions for which the source data container is currently available and that have been registered in the Data Replicator repository for replication.

Example

A common scenario is to check whether the interval between refreshes is shortened for tables that allow smart strategies such as trickle loading or smart sampling. The following query on the Data Replicator repository provides an overview of:

select tpn.tbe_full_qualified_name
,      tpn.ptn_name
,      tpn.tpn_date_last_got_ready
,      tpn.tpn_date_becomes_obsolete
--
-- Table partition-specific default, used for all new versions.
-- Since versions come and go, this is the essential freshness
-- indicator to change.
--,      tpn.tpt_sec_before_obsolete
--
-- Table-specific default, used for all new table partitions.
,      tpn.tbe_sec_before_obsolete
--
-- Master default, used for all new tables.
--
, stg.stg_sec_before_obsolete
--
--
--
,      'update dc_table_partitions set sec_before_obsolete = ? where id = '
        + cast(tpn.tpt_id as varchar)
        + ' and sec_before_obsolete = '
        + cast(tpn.tpt_sec_before_obsolete as varchar)
        + ' /* '        + tpn.tbe_full_qualified_name
        + '/'        + tpn.ptn_name
        + '*/'
from   dc_table_partition_versions_r tpn
join   dc_settings_r stg
on     1=1
--
-- Ready version.
--
where  tpn.tpn_state = 'R'
--
-- Trickle loading or smart sampling.
--
and    tpn.tpn_seeding_approach_used != 'C'--
-- Has a high default value which allows reduced value.
--
and    tpn.tpt_sec_before_obsolete >= 4 * 3600

In the results you can see that TransactionLinesBulk obsoletes every 10 hours and Accounts every 5 hours:

Given the nature of smart strategies, these intervals can be refreshed to a very short interval like one hour. As of release 17.31 that is only possible through changing the repository directly in the database. In general, changing the repository using direct SQL statements is NOT allowed. In this specific scenario it is allowed. As an example, the last column of the query composes such as statement, such as:

update dc_table_partitions
set    sec_before_obsolete = ? /* Fill in value */
where  id = 15
and    sec_before_obsolete = 28800 /* ExactOnlineREST.FinancialTransaction.TransactionLinesBulk/737036*/

Run this statement to change the obsoletion period.