Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi team,
I have one data set with 78 thousand records, it includes duplicate records
for example,
dataset1:
id Saleid delivery date customer name city location
1234 S123. 12 June 2021 ram. Hyderabad. Hyderabad
1234. S123. 11 June 2021 ram Hyderabad Hyderabad
these are the examples of duplicate care records from this I need to load data into Qlik is minimum date records only
if I found duplicate records more than 10 then I need to load least delivery date records and needs to exclude all other records
can anyone help me with the logic to implement and load the data
thanks in advance
//preceding load. first determine min date for desired dimension context (ex. id)
//second inner to table on id+mindate, leaving only the mindate per id
inner join (tbl)
load
id
, mindate as date
//,1 as flag_mindate
;
load distinct
id
,min(date) as mindate
resident tbl
group by id
;
exit script;
Hello @stevejoyce
Thanks for your help, but it hard to understand
Can you please explain a bit more with the attached sample DataSet.
Sorry it's not explained for you to understand. I'm not sure how else to explain, here's the code updated with your data file:
//load excel data
data:
LOAD Sno,
[Sales order ID],
[Delivery date],
[Sales amount],
[Sales Quantity],
[Customer Name],
City,
Loaction,
[ZIP Code]
FROM
[..\Eliminating duplicate records with mindate.xlsx]
(ooxml, embedded labels, table is Duplicates);
//preceding load...
//first determine min date for each sales order id
//second do inner to table on sales order id + mindate, leaving only the mindate per id
inner join (data)
load
[Sales order ID]
, Min_Delivery_Date as [Delivery date]
//,1 as flag_mindate
;
load distinct
[Sales order ID]
,min([Delivery date]) as Min_Delivery_Date
resident data
group by [Sales order ID]
;