Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

identifing dubble or more records

Hi Guys,

I need little help.

Please look at the excel file

There are four columns that have a that are in some cases exactly the same:

The colums are:

invoiceID

supplierID

date

cost

IF the records are excactly the same then 1 else 0.

By doing that I can quickly identify wich invoices have been booked double or more.

I tried the previous function, but that did work really wel.

I made an composite key of those four keys. and doing a count on the key. But I dont know whether, a count is going to be memory consuming.

Please look at the attach file. The marked yelllow records are the records I would like to see in a straight table

                                                                                                            

IdinvoicesupplierIddatecost
101474561-1-201550
111474561-1-2015100
121474551-1-2015100
131484562-1-201550
141494572-1-201547
151504581-1-2015150
161504581-1-2015150
171504581-1-2015150
181514592-1-201511
191524592-1-201511
1 Solution

Accepted Solutions
MK_QSL
MVP
MVP

Data:

Load

  AutoNumber(invoice & supplierId & date & cost) as AutoID

  ,* Inline

[

  Id, invoice,supplierId, date, cost

  10, 147, 456, 01/01/2015, 50

  11, 147, 456, 01/01/2015, 100

  12, 147, 456, 01/01/2015, 100

  13, 148, 456, 02/01/2015, 50

  14, 149, 457, 02/01/2015, 47

  15, 150, 458, 01/01/2015, 150

  16, 150, 458, 01/01/2015, 150

  17, 150, 458, 01/01/2015, 150

  18, 151, 459, 02/01/2015, 11

  19, 152, 459, 02/01/2015, 11

  20, 153, 456, 01/01/2015, 100

  21, 153, 456, 01/01/2015, 100

  22, 153, 456, 01/01/2015, 20

];

Left Join

Load AutoID, Count(AutoID) as TotalAutoID Resident Data Group By AutoID;

Final:

Load Id, invoice, supplierId, date, cost, If(TotalAutoID > 1, 1, 0) as Flag_Duplicate Resident Data;

Drop Table Data;

View solution in original post

4 Replies
MK_QSL
MVP
MVP

What you want to do as a result? Remove those duplicate records or only identify them?

MK_QSL
MVP
MVP

Data:

Load

  AutoNumber(invoice & supplierId & date & cost) as AutoID

  ,* Inline

[

  Id, invoice,supplierId, date, cost

  10, 147, 456, 01/01/2015, 50

  11, 147, 456, 01/01/2015, 100

  12, 147, 456, 01/01/2015, 100

  13, 148, 456, 02/01/2015, 50

  14, 149, 457, 02/01/2015, 47

  15, 150, 458, 01/01/2015, 150

  16, 150, 458, 01/01/2015, 150

  17, 150, 458, 01/01/2015, 150

  18, 151, 459, 02/01/2015, 11

  19, 152, 459, 02/01/2015, 11

  20, 153, 456, 01/01/2015, 100

  21, 153, 456, 01/01/2015, 100

  22, 153, 456, 01/01/2015, 20

];

Left Join

Load AutoID, Count(AutoID) as TotalAutoID Resident Data Group By AutoID;

Final:

Load Id, invoice, supplierId, date, cost, If(TotalAutoID > 1, 1, 0) as Flag_Duplicate Resident Data;

Drop Table Data;

Anonymous
Not applicable
Author

only identify them for business purpose only.

maxgro
MVP
MVP

1.png


A:

LOAD Id,      invoice,      supplierId,      date,       cost

FROM tmp3.xlsx (ooxml, embedded labels, table is Blad1);

Left Join (A)

load *, if(cnt>1,1,0) as Duplicated;

load

     invoice,      supplierId,      date,     cost,

     count(Id) as cnt

Resident A    

Group By      invoice,      supplierId,      date,     cost;