Qlik Community

New to Qlik Sense

Discussion board where members can get started with Qlik Sense.

Announcements
QlikWorld 2020: Join us May 11 - 14, 2020 in Phoenix, AZ. Register early and save $400. Learn More
cas-ictp
New Contributor

Remove duplicates in load based on one column

Hello,

I have imported data in a Qlik table with SQL select. After the import I generated a unique ID, because I need to remove duplicates based on that ID.

I have the following variables; itemNo, shortItemNo, lotNo, location, date, time, ID. I need to remove all duplicate ID's (and according rows). I searched the Qlik forum, but could not find examples that work for me. 

I hope someone can help me. Thanks!

1 Solution

Accepted Solutions
zhadrakas
Valued Contributor

Re: Remove duplicates in load based on one column

 

 

please tell me which line do you want to Keep.

sample:sample.png

imagine you just want to Keep the first line of ID "AAA011lQ11901613"

you could use:

Load distinct id,
firstvalue(shortitemNo) as shortitemNo,
firstvalue(itemNo) as itemNo,
firstvalue(lotNo) as lotNo

...

Group by id

;

thats just an example. if this is not what you want to achive, please describe your desired Output based on the example above.

regards

tim

 

View solution in original post

9 Replies
zhadrakas
Valued Contributor

Re: Remove duplicates in load based on one column

after you've created your unique id, did you try something like this:

Final_Table:

Noconcatenate

Load Distinct Unique_ID, other_fields

resident your_table

;

drop table your_table;

regards

tim

cas-ictp
New Contributor

Re: Remove duplicates in load based on one column

Hello Tim,

Thank you for you help.

Yes I tried, without result. Unfortunately all duplicates ID are still there. 

zhadrakas
Valued Contributor

Re: Remove duplicates in load based on one column

can you share a sample dataset?
cas-ictp
New Contributor

Re: Remove duplicates in load based on one column

See attachment. 

Highlighted
miskinmaz
Contributor III

Re: Remove duplicates in load based on one column

Hi,
You can try the below code:

Load distinct id,
field1,
field2,
and so on...,
sum(1) as test_col
resident table
group by
id,
field1,
field2

Try to keep only necessary fields.
cas-ictp
New Contributor

Re: Remove duplicates in load based on one column

Hi,

I did not work. If I only use column ID it works, but as soon as I add one or more other variables, duplicates are not removed. 

zhadrakas
Valued Contributor

Re: Remove duplicates in load based on one column

 

 

please tell me which line do you want to Keep.

sample:sample.png

imagine you just want to Keep the first line of ID "AAA011lQ11901613"

you could use:

Load distinct id,
firstvalue(shortitemNo) as shortitemNo,
firstvalue(itemNo) as itemNo,
firstvalue(lotNo) as lotNo

...

Group by id

;

thats just an example. if this is not what you want to achive, please describe your desired Output based on the example above.

regards

tim

 

View solution in original post

MVP
MVP

Re: Remove duplicates in load based on one column

LOAD Distinct will remove duplicate identical records, but will not remove duplicate values in one field if they differ in others. To exclude duplicate of a single field, you could use this pattern:

LOAD *
Where Not(Exists(KeyField));
SQL SELECT KeyField,
...
FROM ...;

This will grab the first instance of KeyField, and there will be no indication of excluded records, so make sure this is the correct way to handle your data.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
cas-ictp
New Contributor

Re: Remove duplicates in load based on one column

Hi Tim,

It works, thank you for your help! Smiley Happy

This is what I want to achieve.

Regards,

Cas