Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us to spark ideas for how to put the latest capabilities into action. Register here!
cancel
Showing results for 
Search instead for 
Did you mean: 
Correct_Answer
Contributor III
Contributor III

Eliminating duplicate records based on date

Hi team,

 

I have one data set with 78 thousand records, it includes duplicate records 

for example,

dataset1:

id     Saleid  delivery date customer name city location 

1234 S123.    12 June 2021 ram.                      Hyderabad. Hyderabad 

 

1234. S123.  11 June 2021 ram Hyderabad Hyderabad 

 

these are the examples of duplicate care records from this I need to load data into Qlik is minimum date records only 

 

 

if I found duplicate records more than 10 then I need to load least delivery date records and needs to exclude all other records 

 

can anyone help me with the logic to implement and load the data 

 

thanks in advance 

3 Replies
stevejoyce
Specialist II
Specialist II

//preceding load.  first determine min date for desired dimension context (ex. id)

//second inner to table on id+mindate, leaving only the mindate per id
inner join (tbl)
load
id
, mindate as date
//,1 as flag_mindate
;

load distinct
id
,min(date) as mindate
resident tbl
group by id
;


exit script;

Correct_Answer
Contributor III
Contributor III
Author

Hello @stevejoyce

 

Thanks for your help,  but it hard to understand 

Can you please explain a bit more with  the attached sample DataSet.

 

 

stevejoyce
Specialist II
Specialist II

Sorry it's not explained for you to understand.  I'm not sure how else to explain, here's the code updated with your data file:

 

//load excel data

data:
LOAD Sno,
[Sales order ID],
[Delivery date],
[Sales amount],
[Sales Quantity],
[Customer Name],
City,
Loaction,
[ZIP Code]
FROM
[..\Eliminating duplicate records with mindate.xlsx]
(ooxml, embedded labels, table is Duplicates);

 


//preceding load...

//first determine min date for each sales order id

//second do inner to table on sales order id + mindate, leaving only the mindate per id
inner join (data)
load
[Sales order ID]
, Min_Delivery_Date as [Delivery date]
//,1 as flag_mindate
;

load distinct
[Sales order ID]
,min([Delivery date]) as Min_Delivery_Date
resident data
group by [Sales order ID]
;