Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Eliminate Duplicate

I got the following question

I got two tables I would like to join

Table A : Ccontains details of documents booked in a ERP/SAP Accounting system

ERP document NumberAmountCurrency Status IN ERP
12345         45,00 € EURPaid
12346         50,00 € EURPaid
12347       105,00 € EUROn Hold

Table B : contains details of documents scanned & indexed in a Workflow Tool prior to booking in ERP/SAP Accounting system

ERP document NumberScanning IDAmountCurrency Status in Workflow
12345DCN0001    45,00 € EURCompleted
12345DCN0002    45,00 € EURRejected as it is a Duplicate
12346DCN0003    50,00 € EURCompleted
12347DCN0004  105,00 € EURUnder Approval
Not booked in ERPDCN0005    20,00 € EURRejected as it is a Invalid invoice

The problem is that sometimes the same invoice number is received/scanned  twice and then a new Scanning ID is created and the related ERP document number is indexed to it

When I do the join load in QW Script then I get two line for the same ERP document and that makes all Aggregation formulas to double count 


ERP document NumberCurrencyScanning IDAmountStatus IN ERPStatus in Workflow
Not booked in ERPEURDCN000520 Rejected as it is a Invalid invoice
12345EURDCN000145PaidCompleted
12345EURDCN000245PaidRejected as it is a Duplicate
12346EURDCN000350PaidCompleted
12347EURDCN0004105On HoldUnder Approval

Is there a way I can only load for a specific field “ERP Document number “ one of the two documents  ?

6 Replies
maleksafa
Specialist
Specialist

you can fix that from the script before loading the data, like when you are preparing tableB use Max(statusinWorkFlow) group by ERP document number, that way you will only have one status for each ERP document number, now max returns the max string, so in case you have a specific order for the statuses then you can use something other than the max but make sure that you group by ERP doc number.

same thing can be applied from the qvw part (i would recommend to go with the script part and not qvw), you can use aggregate function on the status or if condition to eliminate the duplication.

Colin-Albert
Partner - Champion
Partner - Champion

You do not need to join the data in QlikView.

Just load Table A and Table B into separate tables. Do not load the Amount & Currency fields from Table B, just load these fields from Table A. The data will be associated on the common field "ERP Document Number" and will sum correctly.

In most cases you do not need to use explicit joins in QlikView.

Not applicable
Author

Thanks Colin

But in Table B there are some invoices (those not booked in ERP) that are not in Table A so I still want to see mentioned fields if not in Table A so I cannot delete them ...

I thought I could solve with following script but still duplñicates appear , no sure why

ERP:

LOAD [ERP document Number],

     Amount,

     Currency,

     [Status IN ERP]

FROM

[RN Project.xlsx]

(ooxml, embedded labels, table is QWERP);

DART:

LOAD Distinct

[ERP document Number]

    

FROM

[RN Project.xlsx]

(ooxml, embedded labels, table is QWDART);

left join (DART)

LOAD

[ERP document Number],

     [Scanning ID],

     Amount,

     Currency,

     [Status in Workflow]

FROM

[RN Project.xlsx]

(ooxml, embedded labels, table is QWDART);

join (ERP) load * resident DART;

drop table DART;

Colin-Albert
Partner - Champion
Partner - Champion

The script below should get the data you require.

ERP:

LOAD

    [ERP document Number],

    [ERP document Number] as ERP2,

    Amount,

     Currency,

     [Status IN ERP]

FROM

[RN Project.xlsx]

(ooxml, embedded labels, table is QWERP);

// Add Scanning ID & status for Documents that have already been loaded from QWERP

join (ERP)

LOAD

[ERP document Number],

[Scanning ID],

[Status in Workflow]

FROM

[RN Project.xlsx]

(ooxml, embedded labels, table is QWDART)

where exists ( ERP2, [ERP document Number])

;

// Add Details for Documents missing from QWERP

concatenate (ERP)

LOAD

[ERP document Number],

[Scanning ID],

Amount,

Currency,

  [Status in Workflow]

FROM

[RN Project.xlsx]

(ooxml, embedded labels, table is QWDART)

where not exists (ERP2, [ERP document Number])

;

drop field ERP2 ;

Not applicable
Author

Thanks for the help collin but as a result i finally got the same output from start

ERP document NumberCurrencyScanning IDAmountStatus IN ERPStatus in Workflow
Not booked in ERPEURDCN000520 Rejected as it is a Invalid invoice
12345EURDCN000145PaidCompleted
12345EURDCN000245PaidRejected as it is a Duplicate
12346EURDCN000350PaidCompleted
12347EURDCN0004105On HoldUnder Approval
Not applicable
Author

Finally I saw what was wrong in my script ....I forgot to add Scanning ID on the firts load

ERP:

LOAD [ERP document Number],

     Amount,

     Currency,

     [Status IN ERP]

FROM

[RN Project.xlsx]

(ooxml, embedded labels, table is QWERP);

Thanks a lot ..You are so helpful people