Faster loading from execute native

sda · 28 juni 2023 om 17:10

Is it possible to consider to implement a Bulk Insert method regarding Evaluate Native SQL Statements from Invantive PSQL - #2?

In my case I need to run a native statement to execute a JOIN at container side (Postgres) with filtering on two tables with million lines. By doing this I would like to prevent the Invantive client to download all data of each tables and make the join at Invantive level.

I tested an execution with a LIMIT 10 000 on the results given by the execute native;
but still it takes 40 seconds to execute the whole procedure. Doing such JOIN at container side with Postgres-clients takes less than 0,03 second.

My guess is that the loop inserts results one by one into the memorystorage thus takes time.

I tried a Bulk insert into runs@inmemorystorage but of course as it is for each r, that does not gives the expected result.

forums · 28 juni 2023 om 17:14

The performance with a for loop should be great, but maybe there is some other factor involved, such as network bandwidth (PostgreSQL statement execute on the native database server, while an execute native with a loop retrieves row-by-row).

Can you provide a working SQL / PSQL sample?

sda · 29 juni 2023 om 07:51

Sure, I’ll provide access to the test environnement.

from which IP address you’ll connect from
where can I send you credentials by secured mean

forums · 29 juni 2023 om 10:25

Thanks for the update, but a test environment is not necessary since the contents can not be reproduced on the forums.

Best is to provide the actual SQL-statements used and the number of rows in each of the tables used.

sda · 29 juni 2023 om 10:44

This is the PostgreSQL query:

select ML.Id
,      ML.Debit
,      ML.credit
,      ML.Balance
,      ML.Account_id
,      ML.Payment_id
,      ML.Journal_id
,      ML.Partner_id
,      ML.Move_id
,      ML.Date
,      ML.Invoice_id
,      ML.Quantity
,      ML.Name
from   account_move_line ML
left
outer
join   account_move MO 
on     ML.Move_id = MO.Id
where  MO.state LIKE 'posted'
LIMIT 10000

Theses are Odoo tables by the way.

That gives into Invantive’s:

create or replace table runs@inmemorystorage
( Id INT
, Debit Numeric
)
--
-- We only make the test on 2 fields.
--
begin
  for r in 
  ( execute native 'select ML.Id, ML.Debit, ML.credit, ML.Balance, ML.Account_id, ML.Payment_id, ML.Journal_id, ML.Partner_id, ML.Move_id, ML.Date, ML.Invoice_id, ML.Quantity, ML.Name
from account_move_line ML
LEFT JOIN account_move M ON ML.Move_id = M.Id
where M.state LIKE ''posted''
LIMIT 10000' DATACONTAINER 'Vega'
  )
  loop
    bulk insert into runs@inmemorystorage (Id, Debit) values (r.Id, r.Debit);
  end loop;
end;

select * from RUNS@InMemoryStorage

Output:

table account_move contains 789368 records
table account_move_line contains 1956568 records

My ultimate purpose is be to built a fast sync script from source (Odoo PostgreSQL) to destination (SQL Server) to insert only newly created content. I need to use server-side filtering commands (ie : if not implemented in Invantive’s PLSQL then use the execute native) to prevent large downloads from the source.

All this because my actual running script with a simple create and replace table runs quite long…

source	table name	records inserted	exec time(s)
vega	account_account	9432	3,05
vega	account_full_reconcile	131262	8,896
vega	account_invoice	186169	87,978
vega	account_invoice_line	284614	72,524
vega	account_invoice_payment_rel	147596	5,94
vega	account_move	789102	154,646
vega	account_move_line	1955772	618,412

forums · 20 juli 2023 om 18:52

The use of bulk insert in the loop per row does not really speed things up (although there is an unexplained little gain), since only one row is inserted per bulk insert. Bulk insert is intended for large volumes of rows, which are then split in batches of thousand(s) rows which are loaded through a specialized loader.

Measurements here where even a lot worse:

--
-- 23057 rows per second.
--
create or replace table tst@inmemorystorage
as
select *
from   calendar@datadictionary
where  1=0

--
-- 25 rows per second.
--
begin
  for r 
  in
  ( select *
    from   calendar@datadictionary
  )
  loop
    insert into tst@inmemorystorage
    (id, day_date, day_in_week, week_number, month_number, day_in_month, month_nr_day1_in_week, month_nr_day7_in_week, iso_week_number, day_in_year_number, year_number, year_nr_day1_in_week, year_nr_day7_in_week, iso_year_number, day_since_19000101_number, century_number, era)
    values
    (r.id, r.day_date, r.day_in_week, r.week_number, r.month_number, r.day_in_month, r.month_nr_day1_in_week, r.month_nr_day7_in_week, r.iso_week_number, r.day_in_year_number, r.year_number, r.year_nr_day1_in_week, r.year_nr_day7_in_week, r.iso_year_number, r.day_since_19000101_number, r.century_number, r.era)
    ;
  end loop;
end;

--
-- 50 rows per second.
--
begin
  for r 
  in
  ( select *
    from   calendar@datadictionary
  )
  loop
    bulk insert into tst@inmemorystorage
    (COLUMNS)
    values
    (r.COLUMNS)
    ;
  end loop;
end;

A nice extension would be that a syntax like the following would be possible:

bulk insert into TABLE
execute native ''

just like using for r in ( execute native.... An idea for this has been registered.

Alternative

As a workaround, one of the following can be used:

view,
temporary table.

View

Create a view in PostgreSQL that contains the output from the query. Then use:

--
-- Optionally preceed with a 'drop view' and 'create view' when more
-- flexibility is needed.
--
create or replace table MYTABLE@InMemoryStorage
as
select *
from   VIEW@postgresql

Temporary table

An alternative, which is more easily configured, is to insert the results in a PostgreSQL-table. Then retrieve that table like in:

begin
  execute native 'create table if not exists TEMPTABLE(...)';
  execute native 'insert into TEMPTABLE ... query ...';
  bulk insert MYTABLE@InMemoryStorage
  ( columns
  )
  select ...
  from   TEMPTABLE@postgresql
  ;
end;

Please note that on the first execution, the TEMPTABLE is introduced, but it’s existence is not seen by the Invantive SQL metadata register. A reconnect will be necessary to refresh the metadata.

The performance related to the original SQL-statements on the same device is then approximately 10.000 rows per second.

forums · 31 oktober 2023 om 13:04

Deze vraag is automatisch gesloten na 2 weken inactiviteit. Het laatste gegeven antwoord is gemarkeerd als oplossing.

Gelieve een nieuwe vraag te stellen via een apart topic als het probleem opnieuw optreedt. Gelieve in de nieuwe vraag een link naar dit topic op te nemen door de URL er van in de tekst te plakken.

system · 7 november 2023 om 13:05

Dit topic is 7 dagen na het laatste antwoord automatisch gesloten. Nieuwe antwoorden zijn niet meer toegestaan.

Topic		Antwoorden	Weergaven
Evaluate Native SQL Statements from Invantive PSQL Wiki (en) invantive-psql	1	974	28 juni 2023
Create or replace table dummy@inmemorystorage from execute native Questions performance , invantive-sql	1	113	22 mei 2024
Add execute native syntax to create or replace table, insert and bulk insert syntax Ideeën performance , invantive-sql	7	537	14 januari 2024
Execute dynamic SQL and PSQL statements Wiki (en) invantive-sql , invantive-psql	0	4019	19 januari 2022
Native Performance High-Volume Updates with Invantive UniversalSQL Wiki (en) invantive-sql	0	535	22 december 2020

Faster loading from execute native

Alternative

View

Temporary table

Gerelateerde topics