Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content
Announcements
WEBINAR June 25, 2025: Build on Apache Iceberg with Qlik Open Lakehouse - REGISTER TODAY
cancel
Showing results for 
Search instead for 
Did you mean: 
bvendeza
Contributor II
Contributor II

Duplicate lines doubling amount in script due to left join

Hello!

I'm having a slight issue with the output my script is producing and hopefully, someone has an idea as to how to tackle this issue.

The actual script is more complex than the sample I'm about to provide, but the logic behind it remains the same.

I have a business logic script that starts off with Table A. This table is the basis and contains one field (invoice amount) that needs to be overridden with the correct values- hence the creation of this BL.

To quickly visualize it for everyone, the entries for table A end up looking like this:

Requisition Doc | Requisition level Category | Invoice Amt | RefID | RefLev | OrderID | OrderLev

4589762                     1                          E                500.05           75635        01        2075624    1000


Invoice Amt will be renamed to [Invoice Amt_Past] to easily differentiate it from the new Invoice Amt that will be generated as the script progresses.

The initial primary keys - to allow the succeeding tables to link back to Table A - are also generated here. 
[Requision Doc] & '/' & [Requisition Item] as %RecDoc%RecItem

[RefID] & '/' & [RefLev] as %RefID%RefItem

[OrderID] & '/' & [OrderLev] as %OrderID%OrderLev

 

The new invoice amount will come from two different tables (Table B & Table C), depending on the category it has been assigned in Table A. The new invoice amount for Category 'E' documents will be pulled from Table B, and the new invoice amount for documents with Category 'S' will be taken from Table C.

This wouldn't be an issue, but the category field is currently not being generated in our base table, which makes it unavailable to use. So we had to work around this by saying "if there's an amount in Table B for this document, take this, otherwise, take the amount in Table C".

Which worked fine up until a few days ago, which we'll get to in a little bit.

For table B, the entries will look like this:

RefID  |  RefLev  |  Amount_Ref

75635          01            453.12

 

We create the key %RefID%RefItem, grab the new amount (Amount_Ref), and left join it to Table A.


Getting to Table C requires an additional table in between (Table C5), because we will need to link Table C to Table C5 in order to get it connected back to Table A.
So we start off with Table C5:

OrderID  |  OrderLev  |  EventID  |  EventLev

2075624          1000         888456        00100

2075624          1000         888456        00200

 

The following keys are then created: %OrderID%OrderLev and %EventID%EventLev

Table C:

EventID  |  EventLev  |  Amount_Event

888456           00100                0
888456           00200            453.12

 

Key %EventID%EventLev is created and left joined to Table C5.

A third table is then generated, resident loading [%OrderID%OrderLev] and Amount_Event from C5, which we then left join back to Table A.

But because of the two lines that exist in Table C5, when the last left join happens, my Amount_Ref cost DUPLICATES. Doubling when it really shouldn't. Due to the table now looking like this:

Requisition Doc  | Requisition level  |  RefID  |  RefLev  |  OrderID  |  OrderLev  |  Amount_Final

4589762                              1                   75635         01        2075624          1000              453.12

4589762                              1                   75635         01        2075624          1000              453.12

 

Rather than the total just being 453.12, it's now 906.24. The left join is necessary because at the end of the script, all of the amounts gets consolidated into 1 table, generating just a single field: Amount_Final that we then reference in the data model later on for master measurers and visualizations.

I've sent a request in to get that category added to the raw script, but until we can get that included, I unfortunately need to work around the system. Does anyone have any suggestions as to how I can eradicate the duplicate line? Loading distinct and even removing the SUM and aggregations inside the code itself did not work 😞

Labels (4)
2 Replies
Bhushan_Mahajan
Creator II
Creator II

@bvendeza Mathematically You can take Average over sum so even if multiple records are there you will get the right sum since your amount value is same for multiple entries.

marcus_sommer

I wouldn't tend to join these tables else to map them. It has no risks to change the number of records and the mapped value could be immediate evaluated instead needing an afterward load-step. Further mappings could be nested in various ways, for example within their third parameter might the next mapping be called.