Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
As a Qlik virgin I need the help of you experts. I need to identify and set a flag against a duplicate record in a load script. Example ogf the table below. I need to check if the Id's are the same and if so i need to set a flag against the oldest date only and set the later date as OK. If the dates are equal i just need to set the first record as a duplicate. many thanks
Id | Auth Date |
C0018337 | 19-Jul-17 |
C0018337 | 27-Jul-17 |
CX211795 | 02-Jun-17 |
CX211795 | 02-Jun-17 |
CX212198 | 08-Jun-17 |
CX212198 | 09-Jun-17 |
N0002488 | 08-Jun-17 |
N0002488 | 14-Jun-17 |
P0000740 | 12-Jun-17 |
P0000740 | 16-Jun-17 |
R0000249 | 06-Jul-17 |
R0000249 | 25-Jul-17 |
May be this
SET DateFormat='DD-MMM-YY';
Table:
LOAD * INLINE [
Id, Auth Date
C0018337, 19-Jul-17
C0018337, 27-Jul-17
CX211795, 02-Jun-17
CX211795, 02-Jun-17
CX212198, 08-Jun-17
CX212198, 09-Jun-17
N0002488, 08-Jun-17
N0002488, 14-Jun-17
P0000740, 12-Jun-17
P0000740, 16-Jun-17
R0000249, 06-Jul-17
R0000249, 25-Jul-17
];
FinalTable:
LOAD Id,
[Auth Date],
If(Id = Previous(Id), 'Duplicate') as Flag
Resident Table
Order By Id, [Auth Date] desc;
DROP Table Table;
This is good advice from Sunny. you can also do a load distinct to just eliminate the duplicates where they have the same dates and Id.
Hi Simon,
Another way:
Data:
LOAD * Inline [
Id, Auth Date
C0018337, 19-Jul-17
C0018337, 27-Jul-17
CX211795, 02-Jun-17
CX211795, 02-Jun-17
CX212198, 08-Jun-17
CX212198, 09-Jun-17
N0002488, 08-Jun-17
N0002488, 14-Jun-17
P0000740, 12-Jun-17
P0000740, 16-Jun-17
R0000249, 06-Jul-17
R0000249, 25-Jul-17
];
Left Join(Data)
LOAD
Id,
min([Auth Date],2) as [Auth Date],
'OK' as Flag
Resident Data Group by Id;
gives:
Id | Auth Date | Flag |
---|---|---|
C0018337 | 27-Jul-17 | OK |
C0018337 | 19-Jul-17 | |
CX211795 | 02-Jun-17 | |
CX212198 | 09-Jun-17 | OK |
CX212198 | 08-Jun-17 | |
N0002488 | 14-Jun-17 | OK |
N0002488 | 08-Jun-17 | |
P0000740 | 16-Jun-17 | OK |
P0000740 | 12-Jun-17 | |
R0000249 | 25-Jul-17 | OK |
R0000249 | 06-Jul-17 |
cheers
Andrew
Simon can you mark the answer as correct and close thread if you are satisfied with the results?
I will once I've had chance to have a play. I am a complete novice so I'm a bit slow at the moment. What I'm not sure about the responses is that the table will contain over 20,000 records (of which only about 12 will be duplicates) and the the two fields in question will not be sorted so I'm not sure if this will be an issue with the above solutions.
Thank you Sunny. Apologies for not making myself clear. I won't be typing the data in, it will be part of a non sorted data table/file.
Thank you Sunny. Apologies for not making myself clear. I won't be typing the data in, it will be part of a non sorted data table/file.
We know that.... the above was just to show how it can be done.... in your case, you will do something like this
SET DateFormat='DD-MMM-YY';
Table:
LOAD *
FROM .....;
FinalTable:
LOAD Id,
[Auth Date],
If(Id = Previous(Id), 'Duplicate') as Flag,
....
Resident Table
Order By Id, [Auth Date] desc;
DROP Table Table;
Red part is where you will load the data from whatever you are loading it from
Thanks Sunny, I'm still having issues. I've tried copying your syntax but I'm obviously doing something wrong.
The table is already loaded (called REP). Can I ask you to spell out the red text in idiot terms for me please. As soon as I type FROM [REP] it doesn't like it. Sorry to be a pain