Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I'm trying to figure out how to remove duplicates that aren't truly duplicates. In the attached data set, I only want to return the row with the latest 'Action Date Time' for a given Login Name, Case ID, Alert ID and Action Date combination. I can't use the Load Distinct command because I need to keep the Action Text field, which is unique for each record.
I thought I could:
However, with both the FirstSortedValue and Min functions, Qlik is still returning all index values as shown in the attached image. Any help would be appreciated.
Thanks.
I was able to find an alternate solution by generating record numbers, and then right joining to a table where I took only the max record number for each respective key. Below is the code.
TEMP_DATA:
LOAD distinct
RecNo() as REC_NBR,
"Case ID",
"Alert ID",
"Action Date Time",
"Create Date*",
"Login Name",
"FI Name",
"Create Date Time",
"Action Text",
"Action Date*",
"Close or Pend*",
//Create Key
Date(Floor(Timestamp#("Action Date*", 'MM/DD/YY hh:mm')), 'YYYY/MM/DD') &'|'& "Alert ID" &'|'& "Case ID" &'|'& "Login Name" as KEY
FROM [lib://AttachedFiles/SampleData (2).xlsx]
(ooxml, embedded labels, table is Input)
NoConcatenate
//Take MAX REC_NBR for KEY
[MAX_REC_NBR]:
LOAD
KEY,
max(REC_NBR) as MAX_REC_NBR
RESIDENT TEMP_DATA
Group By KEY;
Left Join (MAX_REC_NBR)
LOAD *,
REC_NBR as MAX_REC_NBR,
KEY as MAX_REC_KEY
RESIDENT TEMP_DATA;
DROP fields REC_NBR,KEY ;
DROP table TEMP_DATA;
exit script;
using your sampledata
LOAD
"Case ID",
"Alert ID",
"Action Date Time",
"Create Date*",
"Login Name",
"FI Name",
"Create Date Time",
"Action Text",
"Action Date*",
"Close or Pend*",
AutoNumber("Case ID"&"Alert ID"&"Action Text","Case ID"&"Alert ID") as Index
FROM [lib://AttachedFiles/SampleData (2).xlsx]
(ooxml, embedded labels, table is Input);
NoConcatenate
Load * Resident temp
Where Index=1;
drop table temp;
exit Script;
Thank you @vinieme12. I believe this gets me 90% there. When doing some quality checks on my larger data set, I noticed an example where I would have expected the index to increment but it didn't, as shown in the image below.
I thought it would be a matter of changing the AutoID argument of the AutoNumber function, but making the tweak shown below didn't fix it. I've been struggling to understand the AutoID argument, so any advice on this is appreciated.
Thanks.
AutoNumber("Action Date*"&LOGIN_NAME&ALERT_ID&ACTION_TEXT, caseId&ALERT_ID&ACTION_DATE) as "Index*"
Remove actiondate from autoid param
AutoNumber("Action Date*"&LOGIN_NAME&ALERT_ID&ACTION_TEXT, caseId&ALERT_ID) as "Index*"
The output is actually the same if I have the AutoId parameter as caseId&ALERT_ID or caseId&ALERT_ID&ACTION_DATE. Both index the 2 line items as a "1".
I was able to find an alternate solution by generating record numbers, and then right joining to a table where I took only the max record number for each respective key. Below is the code.
TEMP_DATA:
LOAD distinct
RecNo() as REC_NBR,
"Case ID",
"Alert ID",
"Action Date Time",
"Create Date*",
"Login Name",
"FI Name",
"Create Date Time",
"Action Text",
"Action Date*",
"Close or Pend*",
//Create Key
Date(Floor(Timestamp#("Action Date*", 'MM/DD/YY hh:mm')), 'YYYY/MM/DD') &'|'& "Alert ID" &'|'& "Case ID" &'|'& "Login Name" as KEY
FROM [lib://AttachedFiles/SampleData (2).xlsx]
(ooxml, embedded labels, table is Input)
NoConcatenate
//Take MAX REC_NBR for KEY
[MAX_REC_NBR]:
LOAD
KEY,
max(REC_NBR) as MAX_REC_NBR
RESIDENT TEMP_DATA
Group By KEY;
Left Join (MAX_REC_NBR)
LOAD *,
REC_NBR as MAX_REC_NBR,
KEY as MAX_REC_KEY
RESIDENT TEMP_DATA;
DROP fields REC_NBR,KEY ;
DROP table TEMP_DATA;
exit script;