
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Remove duplicates in load based on one column
Hello,
I have imported data in a Qlik table with SQL select. After the import I generated a unique ID, because I need to remove duplicates based on that ID.
I have the following variables; itemNo, shortItemNo, lotNo, location, date, time, ID. I need to remove all duplicate ID's (and according rows). I searched the Qlik forum, but could not find examples that work for me.
I hope someone can help me. Thanks!
Accepted Solutions


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
please tell me which line do you want to Keep.
sample:
imagine you just want to Keep the first line of ID "AAA011lQ11901613"
you could use:
Load distinct id,
firstvalue(shortitemNo) as shortitemNo,
firstvalue(itemNo) as itemNo,
firstvalue(lotNo) as lotNo
...
Group by id
;
thats just an example. if this is not what you want to achive, please describe your desired Output based on the example above.
regards
tim


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
after you've created your unique id, did you try something like this:
Final_Table:
Noconcatenate
Load Distinct Unique_ID, other_fields
resident your_table
;
drop table your_table;
regards
tim

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello Tim,
Thank you for you help.
Yes I tried, without result. Unfortunately all duplicates ID are still there.


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
See attachment.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You can try the below code:
Load distinct id,
field1,
field2,
and so on...,
sum(1) as test_col
resident table
group by
id,
field1,
field2
Try to keep only necessary fields.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I did not work. If I only use column ID it works, but as soon as I add one or more other variables, duplicates are not removed.


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
please tell me which line do you want to Keep.
sample:
imagine you just want to Keep the first line of ID "AAA011lQ11901613"
you could use:
Load distinct id,
firstvalue(shortitemNo) as shortitemNo,
firstvalue(itemNo) as itemNo,
firstvalue(lotNo) as lotNo
...
Group by id
;
thats just an example. if this is not what you want to achive, please describe your desired Output based on the example above.
regards
tim


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
LOAD Distinct will remove duplicate identical records, but will not remove duplicate values in one field if they differ in others. To exclude duplicate of a single field, you could use this pattern:
LOAD *
Where Not(Exists(KeyField));
SQL SELECT KeyField,
...
FROM ...;
This will grab the first instance of KeyField, and there will be no indication of excluded records, so make sure this is the correct way to handle your data.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Tim,
It works, thank you for your help!
This is what I want to achieve.
Regards,
Cas
