Eliminating duplicate records based on date - Qlik Community

Correct_Answer · ‎2021-08-09

Hi team,

I have one data set with 78 thousand records, it includes duplicate records

for example,

dataset1:

id Saleid delivery date customer name city location

1234 S123. 12 June 2021 ram. Hyderabad. Hyderabad

1234. S123. 11 June 2021 ram Hyderabad Hyderabad

these are the examples of duplicate care records from this I need to load data into Qlik is minimum date records only

if I found duplicate records more than 10 then I need to load least delivery date records and needs to exclude all other records

can anyone help me with the logic to implement and load the data

thanks in advance

stevejoyce · ‎2021-08-09

//preceding load. first determine min date for desired dimension context (ex. id)

//second inner to table on id+mindate, leaving only the mindate per id
inner join (tbl)
load
id
, mindate as date
//,1 as flag_mindate
;

load distinct
id
,min(date) as mindate
resident tbl
group by id
;

exit script;

Correct_Answer · ‎2021-08-09

Hello @stevejoyce

Thanks for your help, but it hard to understand

Can you please explain a bit more with the attached sample DataSet.

stevejoyce · ‎2021-08-09

Sorry it's not explained for you to understand. I'm not sure how else to explain, here's the code updated with your data file:

//load excel data

data:
LOAD Sno,
[Sales order ID],
[Delivery date],
[Sales amount],
[Sales Quantity],
[Customer Name],
City,
Loaction,
[ZIP Code]
FROM
[..\Eliminating duplicate records with mindate.xlsx]
(ooxml, embedded labels, table is Duplicates);

//preceding load...

//first determine min date for each sales order id

//second do inner to table on sales order id + mindate, leaving only the mindate per id
inner join (data)
load
[Sales order ID]
, Min_Delivery_Date as [Delivery date]
//,1 as flag_mindate
;

load distinct
[Sales order ID]
,min([Delivery date]) as Min_Delivery_Date
resident data
group by [Sales order ID]
;