Qlik Community

New to QlikView

Discussion board where members can get started with QlikView.

Announcements
Qlik Analytics Tour 2020 Online. Begins August 10th. Register Today
cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Creator II
Creator II

How to remove duplicate records while loading a table based on a column

Hi  Guys, 

Problem : I need to load a table from an excel but need to remove duplicate records based on a column for same Order_Numbers. For ex:

Below image shows the sample records in the Orders sheet that has unique Order_Number

Orders.PNG

Below image is of the MappingSheet in which I want to remove the duplicate records for each Order_Number based on the column "IssueType". So, MappingSheet can have multiple records for one Order_Number but what iIwant is to load only those Order_Numbers which have different IssueType.

OrderDuplication.PNG

 So, only highlighted records should be loaded into Qlik. Note that IssueType can be same for different orders as is the case for Order_Number 1 and 2. I have to check for the duplicate IssueType for each Order_Number and if there are more than one records with same IssueType for a single Order_Number then I have to pick only one with the oldest date.

Hope, I have made the requirement clear enough but if not please let me know. I need to come up with a solution real quick for this one so any suggestion would be of great help.

Thanks,

Pranav

 

 

 

13 Replies
Highlighted
Partner
Partner

Re: How to remove duplicate records while loading a table based on a column

Hey Pranaview,

Ok, so first I think you need to make a stronger %Key, like:

Order_Number &' | '& IssueType as %Key

After doing that, you can try using count... Like...

If( count(%Key)>1 , max (Date_Auto) , Date_Auto)

I think it should work.. maybe not the best solution, but it's the first thing that came into my mind ...

Note that i didn't test it, so let me know if it works..

Hope that I've helped you.

Regards, Fuku.
Highlighted
Creator II
Creator II

Re: How to remove duplicate records while loading a table based on a column

Hi Fuku,
Thanks a lot for the suggestion.
I have a question. Where should i add the If condition? During the load? or in the charts? cos what i want is to just not load the duplicate records into Qlik.
Highlighted
Partner
Partner

Re: How to remove duplicate records while loading a table based on a column

Try with the below

T:
LOAD [Order Number],
Team,
[Issue Type],
Comment,
Date
FROM
[z:\xxx.xlsx]
(ooxml, embedded labels, table is Sheet2);

T1:
Load *
Where Flag='Yes';

Load *,
if([Order Number]&[Issue Type]=Previous([Order Number]&[Issue Type]),'No','Yes') as Flag
Resident T order by [Order Number] asc,[Issue Type] asc;

DROP Table T;

Highlighted
Partner
Partner

Re: How to remove duplicate records while loading a table based on a column

Well, if you don't want to load the duplicate, you can add it to the script, but you'll need to "Group by" the other fields...
Highlighted
Partner
Partner

Re: How to remove duplicate records while loading a table based on a column

There are many ways to achieve this. You can construct a composit key and use firstsortedvalue on all the fields etc perhaps using an inner Join by RowNo would be faster. Be aware that you are using a transparent rule regarding which rows you keep and discard, in the example below we take the first value sorted by date. 

source:
load
*,
RowNo() as RowNo
;
load * inline [
OrderNo, IssueType, DateId
1,1,1
1,1,2
1,2,2
2,1,3
2,5,4
3,6,5
4,7,6
4,7,8
5,9,8
5,10,9
5,10,10
];

inner join (source)

Load
FirstSortedValue(RowNo,DateId) as RowNo
Resident source group by OrderNo, IssueType;

Highlighted
Partner
Partner

Re: How to remove duplicate records while loading a table based on a column

You can do something like this:

Load    *,  

             Max(Date_Auto)               as Date_Auto,

             count( DISTINCT  Order_Number &' | '& IssueType)                      as %Key,

 

From []  Group by ;

You need to put all other fields in Group , except by Key fields and Date;

      

Hope it helps Pranav.

Fuku.

Highlighted
Creator II
Creator II

Re: How to remove duplicate records while loading a table based on a column

Hi Fuku,
So, I tried doing what has suggested in your first post. I created a composite key of Order_Number and IssueType i.e. %Key while loading the data.
Then I tried to use firstsortedvalue() function to get unique records based on the composite key as below
Load %Key,
Date(Max(Date_Auto),'MM/DD/YYYY') as Date_Auto,
FirstSortedValue(Distinct Order_Number,-Date_Auto) as Order_Number,
FirstSortedValue(Distinct Team,-Date_Auto) as Team,
FirstSortedValue(Distinct IssueType,-Date_Auto) as IssueType,
FirstSortedValue(Distinct Comment,-Date_Auto) as Comment
Resident Delivery Group by %Key;

But in the Table view, i am just getting value in %Key column, rest are all null.

Pranav
Highlighted
Creator II
Creator II

Re: How to remove duplicate records while loading a table based on a column

Hi Jonas,
Thanks for the response.
I tried the first approach i.e. creating a composite key of Order Number and IssueType and then using firstsortedvalue() by Date_Auto field and Group By on the composite key but in the result I am not getting any values except in the composite key column.

I tried the script that you have pasted here but it is not showing any data in the Table View.
I am really new to QV. Hence the struggle.

Thanks,
Pranav
Highlighted
Partner
Partner

Re: How to remove duplicate records while loading a table based on a column

Can you post the script that you have tried here?