Join AttachmentByUrl@eol and join jsontables creates duplicate entries

It seems the AttachmentByUrl@eol creates duplicates values by using join

in my setup (QueryTool 22.0.578):

select * from exactonlinerest..documents@eol dct
join exactonlinerest..documentattachmentfiles@eol dae on dct.id = dae.document
where dct.type = 40 AND dct.Created > dateadd('day','-1',sysdateutc) AND dct.SendMethod = 2

retrieves 4 results

adding the AttachmentByUrl@eol :

select * from exactonlinerest..documents@eol dct
join exactonlinerest..documentattachmentfiles@eol dae on dct.id = dae.document
join AttachmentByUrl@eol (dae.url || '&Download=1') dcta
where dct.type = 40 AND dct.Created > dateadd('day','-1',sysdateutc) AND dct.SendMethod = 2

retrieves 20 results

then adding the join jsontable

select count(fle.*), fle.Transactionnumber from exactonlinerest..documents@eol dct
join exactonlinerest..documentattachmentfiles@eol dae on dct.id = dae.document
join AttachmentByUrl@eol (dae.url || '&Download=1') dcta
join jsontable
      ('Transactions.[*]'
      passing blob_to_text(dcta.DocumentContents, 'Unicode')
      columns
           --bankAccountID varchar2 path '::BankAccountId'
           transactiondate datetime path '.TransactionDate'
          ,Description varchar2 path 'Description'
          --,rawData varchar2 path 'RawData'
          ,ownAccount varchar2 path 'Owner.BankEntity.AccountIdentification'
          ,SequenceNumber varchar2 path 'SequenceNumber'
          ,TransactionNumber varchar2 path 'TransactionNumber'
          --,TransactionIdentifiers varchar2 path 'TransactionIdentifiers'
          ,Type varchar2 path 'Type'
          ,amount varchar2 path 'Amounts.[*].Value'
          ,CounterpartyName varchar2 path 'CounterParty.Name'
          ,CounterpartyAccount varchar2 path 'CounterParty.BankEntity.AccountIdentification'
      ) fle
where dct.type = 40 AND dct.Created > dateadd('day','-1',sysdateutc) AND dct.SendMethod = 2
group by TransactionNumber

gives 175 results, with some items repeated 15 times.

Capture d’écran 2023-03-21 à 17.01.49

It seems adding select distinct somehow does not solve the issue:
it gives 75 results (down from 175) but some results are still shown 3,2 and few 1 times.

I think I found a path.
I’m testing right now, I’ll elaborate when found.

new findings:

  • the duplication caused by the join AttachmentByUrl@eol is confirmed

it duplicates 5 times the results

select * from exactonlinerest..documents@eol dct
join exactonlinerest..documentattachmentfiles@eol dae on dct.id = dae.document
join AttachmentByUrl@eol (dae.url || '&Download=1') dcta
where dct.type = 40 AND dct.Created > '2023-03-21T00:00:00' AND dct.SendMethod = 2

result : 20 rows

select * from exactonlinerest..documents@eol dct
join exactonlinerest..documentattachmentfiles@eol dae on dct.id = dae.document
where dct.type = 40 AND dct.Created > '2023-03-21T00:00:00' AND dct.SendMethod = 2

result : 4 rows

  • the select distinct clause not working as expected

w-We have to take into consideration that in PSD2 mode Exact retrieves bank transactions (and not statements) 4 times a day, whereas in the classic bank link it gets the statement only once a day.
In my above query I was extracting the SequenceNumber of the json file, and found out that during the day, a same transaction would have different position in the json file, thus different SequenceNumber.
As a consequence the select distinct was indeed working but excluding theses cases and not presenting the results as expected.

the good statement becomes:

select distinct dct.division, fle.* from  exactonlinerest..documents@eol dct 
join   exactonlinerest..documentattachmentfiles@eol dae on dct.id = dae.document
join AttachmentByUrl@eol (dae.url || '&Download=1') dcta 
join jsontable
      ('Transactions.[*]'
      passing blob_to_text(dcta.DocumentContents, 'Unicode')
      columns
           --bankAccountID varchar2 path '::BankAccountId'
           transactiondate datetime path '.TransactionDate'
          ,Description varchar2 path 'Description'
          --,rawData varchar2 path 'RawData'
          ,ownAccount varchar2 path 'Owner.BankEntity.AccountIdentification'
          --,SequenceNumber varchar2 path 'SequenceNumber' -> removed since the file is downloaded 4 times/day by Eol, the same transaction can be present in the 4 json files but with different Sequence number.
          ,TransactionNumber varchar2 path 'TransactionNumber'
          --,TransactionIdentifiers varchar2 path 'TransactionIdentifiers'
          ,Type varchar2 path 'Type'
          ,amount varchar2 path 'Amounts.[*].Value'
          ,CounterpartyName varchar2 path 'CounterParty.Name'
          ,CounterpartyAccount varchar2 path 'CounterParty.BankEntity.AccountIdentification'
      ) fle
where dct.type = 40 AND dct.Created > dateadd('day','-10',sysdateutc) AND dct.SendMethod = 2

This question was automatically closed after at least 2 weeks of inactivity after a possible solution was provided. The last answer given has been marked as a solution.

Please ask a new question via a separate topic if the problem occurs again. Please include a link to this topic in the new question by pasting its URL into the text.

Dit topic is 7 dagen na het laatste antwoord automatisch gesloten. Nieuwe antwoorden zijn niet meer toegestaan.