Problem : I need to load a table from an excel but need to remove duplicate records based on a column for same Order_Numbers. For ex:
Below image shows the sample records in the Orders sheet that has unique Order_Number
Below image is of the MappingSheet in which I want to remove the duplicate records for each Order_Number based on the column "IssueType". So, MappingSheet can have multiple records for one Order_Number but what iIwant is to load only those Order_Numbers which have different IssueType.
So, only highlighted records should be loaded into Qlik. Note that IssueType can be same for different orders as is the case for Order_Number 1 and 2. I have to check for the duplicate IssueType for each Order_Number and if there are more than one records with same IssueType for a single Order_Number then I have to pick only one with the oldest date.
Hope, I have made the requirement clear enough but if not please let me know. I need to come up with a solution real quick for this one so any suggestion would be of great help.
Try with the below
LOAD [Order Number],
(ooxml, embedded labels, table is Sheet2);
if([Order Number]&[Issue Type]=Previous([Order Number]&[Issue Type]),'No','Yes') as Flag
Resident T order by [Order Number] asc,[Issue Type] asc;
DROP Table T;
There are many ways to achieve this. You can construct a composit key and use firstsortedvalue on all the fields etc perhaps using an inner Join by RowNo would be faster. Be aware that you are using a transparent rule regarding which rows you keep and discard, in the example below we take the first value sorted by date.
RowNo() as RowNo
load * inline [
OrderNo, IssueType, DateId
inner join (source)
FirstSortedValue(RowNo,DateId) as RowNo
Resident source group by OrderNo, IssueType;
You can do something like this:
Max(Date_Auto) as Date_Auto,
count( DISTINCT Order_Number &' | '& IssueType) as %Key,
From  Group by ;
You need to put all other fields in Group , except by Key fields and Date;
Hope it helps Pranav.