I want to one-way sync some data FROM
I have un bunch of tables some of them are very small in # records so a
CREATE OR REPLACE is fine.
One table is an accounting table now with 600K records, and growing.
CREATE OR REPLACE works with the amount of records growing it will make unnecessary computing tasks (like deleting and re-inserting 99% of unchanged records), so it’s not nice.
For that I followed on the forums :
I tried the
SYNCHRONIZE that achieve a nice job like this:
CREATE OR REPLACE TABLE account_move@inmemorystorage
as select * from account_move@Postgre WHERE company_id IN (4)
SYNCHHONIZE account_move@inmemorystorage to account_move@sql
with insert identified by id
But It will take 23s to load 27K records (in this case filtered to company_id :4)
synchronize statement will need 78 seconds to compare the source
account_move@inmemorystorage with 27K records to the destination
account_move@sql with 600K records.
In this specific case, I know that my source data has a unique
id, only ascending, and that when a record is created, no need to look if it has been deleted or updated. It is not possible in the workflows.
To achieve the sync job, I only need to look the last (or max)
id inserted, and select in the source all
ids that are above that one.
INSERT INTO table account_move@sql as select * from account_move@postgre
WHERE id > select MAX(id) from account_move@sql
that should be way more faster that the nice
synchronize in this case.
select MAX(id) from account_move@sql is a long statement, is seems that Query Tool is not passing the MAX operator to sql, instead downloading the all set of data, and doing the MAX on Query Tool side. for 600K records, it takes 80s to output the result.
select id from account_move@sql ORDER BY id DESC LIMIT 1
will produce the same behaviour, the results takes 84s to sort.
Am I on a wrong path ?