Custom logic triggers on data synchronize statement for easier data integration


The synchronize statement has been extended with options to include custom logic written in Invantive PSQL. The PSQL custom logic is executed when an insert, update or delete event is triggered. Separate trigger logic can be specified to execute before and after the event, as well as instead of the event.

The use of custom trigger logic has the following advantages in complex data synchronization or data integration scenarios, increasing the usability of Invantive SQL as a one-stop integration platform for business data:

  • eases data integration tasks,
  • reduces the amount of labor required for complex synchronization tasks,
  • increases serviceability, auditability and reliability.

The new functionality is available starting with release 22.1 BETA and will be included in the next production release.

Existing Synchronization Functionality

The release 22.0 data synchronization functionality of synchronize remains fully maintained and supported.

Using synchronize, a single SQL statement can unidirectionally or bidirectionally compare and align the contents of two tables. This alignment through data synchronization ensures that the values of identically named columns match (allowing for small differences in data format, such as long and short).

Existing statements will continue to synchronize data identically without code changes. The synchronize statement remains the preferred solution for batchwise data synchronization. Please use the other Invantive SQL facilities for real-time data synchronization, such as Invantive App Online catching webhook events and translating them into actions on tables.

Data Syncing Challenges

Synchronizing data is a powerful concept for data integration with significant business value. However, the data synchronization process can pose requirements beyond the 22.0 functionality of synchronize, such as different tables to read from and write to, an option for dry runs and/or action auditing.

Different Tables to Read and Write from

A typical challenge for data synchronization tools arises when the table (or API) that must be adapted differs from the table used to compare the data with. For instance, some platforms have (relatively slow) tables that allow insert, while the comparison can use an optimized read-only version. A popular example are the Exact Online transaction tables: the read-only *Incremental tables easily return large data volumes, such as millions of transactions in minutes, whereas the read/write versions are limited to at most 60 rows per second and a cumulative read use of at most 300,000 rows per day.

Using custom trigger logic, the synchronization can read from an incremental table such as AccountsIncremental and make changes through the (slower) Accounts table. This can run up to hundreds of times faster, while reducing strain and increasing scalability across large environments.

Of course, this scenario assumes that over time both tables contain increasingly identical data (and identical data straight after the sync finishes). The decreasing number of changes needed greatly reduces the amortized runtime when using the custom logic.
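
As an illustration, the following sketch compares a local copy against AccountsIncremental while routing the actual changes through the regular Accounts table using instead of triggers. The accounts_copy staging table, the eol data container alias and the Code and Name columns are illustrative assumptions (the actual column list differs per environment), and delete handling is omitted for brevity:

sync accounts_copy@inmemorystorage
to   AccountsIncremental@eol
with insert or update
identified
by   Code
triggers
instead of insert
begin
  -- Assumption: apply the change through the read/write Accounts table
  -- instead of the read-only incremental table.
  insert into Accounts@eol
  ( Code
  , Name
  )
  values
  ( :new.Code
  , :new.Name
  );
end
instead of update
begin
  update Accounts@eol
  set    Name = :new.Name
  where  Code = :new.Code;
end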

Dry-run and Action Auditing

Another challenge for business data synchronization tools is that the synchronization runs fine, but given the complexity of the synchronization process it is hard to audit the actual process. Or, even worse, sometimes the list of changes has to be signed off before the actual synchronization can take place.

Using custom code to sync data, the changes can be shown in advance before they are executed, and an audit trail of the actual changes can easily be recorded, even when the target platform lacks auditing features.
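
For instance, a dry run can be approximated by letting the instead of triggers only report the planned changes instead of applying them. The sketch below reuses the sample tables introduced later in this post:

sync source@inmemorystorage
to   target@inmemorystorage
with insert or update or delete
identified
by   dummy_string
triggers
instead of insert
begin
  -- Report the planned insert; the actual insert is suppressed.
  dbms_output.put_line('Would insert ID ' || :new.dummy_string || '.');
end
instead of update
begin
  -- Report the planned update; the target stays unchanged.
  dbms_output.put_line('Would update ID ' || :new.dummy_string || ' from ' || :old.dummy_int32 || ' to ' || :new.dummy_int32 || '.');
end
instead of delete
begin
  -- Report the planned delete; the target stays unchanged.
  dbms_output.put_line('Would delete ID ' || :old.dummy_string || '.');
end

An audit trail can be recorded in a similar way by inserting the :old and :new values into a logging table from the before or after triggers.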

Custom Logic Syntax

The synchronize statement synchronizes the source table to the target table, using custom logic for any of the triggering events. There are nine possible triggers on a synchronize, namely every combination of:

  • the change event: insert, update or delete,
  • the execution moment: before, instead of or after the change event.

At the end of the synchronize statement, the triggers are specified after the keyword triggers, each as an execution moment followed by a change event. For instance, the following custom logic is executed before every insert:

synchronize
...
triggers
before insert
begin
  ... code ...
end;

The custom code can refer to the old and new values using the prefixes :old and :new respectively. Please note that :old and :new depend on the synchronization direction. Especially for bidirectional synchronization, where the direction alternates, the source and target tables may differ in their list of columns. References with :old and :new to columns not available in the selected source or target evaluate to null.

For instance, when a table contains the column name, the following code prints alterations of the name:

synchronize
...
triggers
before update
begin
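  -- Treat a transition between null and a non-null value as a change too.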
  if ( :old.name is null and :new.name is not null )
     or
     ( :old.name is not null and :new.name is null )
     or
     :old.name != :new.name
  then
    dbms_output.put_line('Name is changed from ' || :old.name || ' to ' || :new.name || '.');
  end if;
end;

Sample Custom Logic

The sample below synchronizes the source table to the target table using triggers. Please create and fill the tables first, as shown at the end of this post, before syncing data.

The result of executing it on the sample tables is always:

before insert on ID A
instead of insert on ID A: 5
after insert on ID A
before update on ID B
instead of update on ID B: 4 -> 7
after update on ID B

Data synchronize statement:

sync source@inmemorystorage
to   target@inmemorystorage
with insert or update or delete
identified
by   dummy_string
triggers
before insert
begin
  dbms_output.put_line('before insert on ID ' || :new.dummy_string);
end
instead of insert
begin
  dbms_output.put_line('instead of insert on ID ' || :new.dummy_string || ': ' || :new.dummy_int32);
end
after insert
begin
  dbms_output.put_line('after insert on ID ' || :new.dummy_string);
end
before update
begin
  dbms_output.put_line('before update on ID ' || :new.dummy_string);
end
instead of update
begin
  dbms_output.put_line('instead of update on ID ' || :new.dummy_string || ': ' || :old.dummy_int32 || ' -> ' || :new.dummy_int32);
end
after update
begin
  dbms_output.put_line('after update on ID ' || :new.dummy_string);
end
before delete
begin
  dbms_output.put_line('before delete on ID ' || :old.dummy_string);
end
instead of delete
begin
  dbms_output.put_line('instead of delete on ID ' || :old.dummy_string);
end
after delete
begin
  dbms_output.put_line('after delete on ID ' || :old.dummy_string);
end

Sample Table Creation

In preparation for the synchronization, the SQL code below creates two tables (source and target) with the following contents.

Table Source

dummy_string  dummy_int32  dummy_datetime
A             5            now
B             7            now
C             9            now

Table Target

dummy_string  dummy_int32
B             4
C             9

Table Creation Code

create or replace table source@inmemorystorage
as
select 5     dummy_int32
,      now() dummy_datetime
,      'A'   dummy_string
union all
select 7     dummy_int32
,      now() dummy_datetime
,      'B'   dummy_string
union all
select 9     dummy_int32
,      now() dummy_datetime
,      'C'   dummy_string;

create or replace table target@inmemorystorage
as
select 'B'   dummy_string
,      4     dummy_int32
union all
select 'C'   dummy_string
,      9     dummy_int32;
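
To verify the table contents before running the sync, the tables can be queried as usual, for instance:

select *
from   source@inmemorystorage
order
by     dummy_string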

Bulk Loading

In general, the custom logic triggers are called on a per-row basis, so in case there are two rows to be updated, the calling order will be:

  1. before update row 1
  2. (instead of) update row 1
  3. after update row 1
  4. before update row 2
  5. (instead of) update row 2
  6. after update row 2

However, Invantive SQL provides bulk loading support on various platforms, such as SQL Server, PostgreSQL and Oracle RDBMS (native driver). In that case, inserts are bundled in batches. The size of each batch can be specified using the batchsize keyword; see the documentation for details.

Each batch is typically processed as an indivisible unit. However, it is not guaranteed to be an atomic transaction: on failure, one or more rows in the batch will fail to load while others may succeed. In general, it is impossible to find out which rows failed and which were applied without re-reading the actual database contents.

When no instead of trigger is used and bulk loading applies with a batch size larger than 1, the calling order for two rows to be inserted will be:

  1. before insert row 1
  2. before insert row 2
  3. bulk insert row 1 and 2
  4. after insert row 1
  5. after insert row 2

When there are more rows to upload than the batch size allows, the calling order will be the same, but only within each batch. Subsequent batches repeat the order from the start with the next rows.
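
For instance, with a batch size of 2 and three rows to insert, the calling order will be:

  1. before insert row 1
  2. before insert row 2
  3. bulk insert row 1 and 2
  4. after insert row 1
  5. after insert row 2
  6. before insert row 3
  7. bulk insert row 3
  8. after insert row 3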