Hi there,
I want to one-way sync some data FROM PostGre
TO Sql
.
I have un bunch of tables some of them are very small in # records so a CREATE OR REPLACE
is fine.
One table is an accounting table now with 600K records, and growing.
Using a CREATE OR REPLACE
works with the amount of records growing it will make unnecessary computing tasks (like deleting and re-inserting 99% of unchanged records), so it’s not nice.
For that I followed on the forums :
I tried the SYNCHRONIZE
that achieve a nice job like this:
CREATE OR REPLACE TABLE account_move@inmemorystorage
as select * from account_move@Postgre WHERE company_id IN (4)
SYNCHHONIZE account_move@inmemorystorage to account_move@sql
with insert identified by id
But It will take 23s to load 27K records (in this case filtered to company_id :4)
then the synchronize
statement will need 78 seconds to compare the source account_move@inmemorystorage
with 27K records to the destination account_move@sql
with 600K records.
In this specific case, I know that my source data has a unique id
, only ascending, and that when a record is created, no need to look if it has been deleted or updated. It is not possible in the workflows.
To achieve the sync job, I only need to look the last (or max) id
inserted, and select in the source all ids
that are above that one.
like :
INSERT INTO table account_move@sql as select * from account_move@postgre
WHERE id > select MAX(id) from account_move@sql
that should be way more faster that the nice synchronize
in this case.
but the select MAX(id) from account_move@sql
is a long statement, is seems that Query Tool is not passing the MAX operator to sql, instead downloading the all set of data, and doing the MAX on Query Tool side. for 600K records, it takes 80s to output the result.
doing a
select id from account_move@sql ORDER BY id DESC LIMIT 1
will produce the same behaviour, the results takes 84s to sort.
Am I on a wrong path ?
thanks !