Qlik Community

New to QlikView

Discussion board where members can get started with QlikView.

Announcements
IMPORTANT: Upcoming LEF Database Maintenance, Oct. 3rd - SEE DETAILS
cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Creator II
Creator II

How to remove duplicate records while loading a table based on a column

Hi  Guys, 

Problem : I need to load a table from an excel but need to remove duplicate records based on a column for same Order_Numbers. For ex:

Below image shows the sample records in the Orders sheet that has unique Order_Number

Orders.PNG

Below image is of the MappingSheet in which I want to remove the duplicate records for each Order_Number based on the column "IssueType". So, MappingSheet can have multiple records for one Order_Number but what iIwant is to load only those Order_Numbers which have different IssueType.

OrderDuplication.PNG

 So, only highlighted records should be loaded into Qlik. Note that IssueType can be same for different orders as is the case for Order_Number 1 and 2. I have to check for the duplicate IssueType for each Order_Number and if there are more than one records with same IssueType for a single Order_Number then I have to pick only one with the oldest date.

Hope, I have made the requirement clear enough but if not please let me know. I need to come up with a solution real quick for this one so any suggestion would be of great help.

Thanks,

Pranav

 

 

 

13 Replies
Highlighted
Partner
Partner

Hmm, try the last suggestion that i've made...

Without a composite the first Key, only :
max(Date_Auto) as Date_Auto,
count( DISTINCT Order_Number &' | '& IssueType) as %Key

And group by all other fields.
I think it should work... You don't even need to use FirstSortedValue() in this case..

Regards, Fuku.
Highlighted
Partner
Partner

Hello again, Pranav.

Did you find a solution? Or did our suggestions worked?

Let us know, please!

Best Regards, Fuku.
Highlighted
Creator II
Creator II

Hi Fuku,

I was not able to make it work with your latest suggestion. Although I made it work with a slight tweak in the solution provided by @madhudinesh. But Yesterday, the client again changed(surprise surprise) their mind and now they want the duplicate values to be displayed as well so that they can catch the issue in the source and directly fix it there.But anyway thanks a lot to you and everyone else for pitching in with valuable suggestions. I will close this thread now.

Pranav
Highlighted
Partner
Partner

Try this:

TempData:
Load * Inline
[
Order_Number,Team,IssueType,Comment,Date_Auto
1,Team1,Issue1,Oldest Logged,43400
1,Team2,Issue1,Comment2,43401
1,Team3,Issue2,Comment3,43401
2,Team4,Issue1,Comment4,43402
2,Team5,Issue1,Comment5,43403
2,Team6,Issue4,Comment6,43404
]
;
MaxDate:
NoConcatenate Load
    Order_Number,
    IssueType,
    Min(Date_Auto) as Date_Auto
Resident
    TempData
Group By
    Order_Number,
    IssueType
;
Left Join(MaxDate)
Load * Resident TempData;

Drop Table TempData;