Solved: Remove duplicates in load based on one column - Qlik Community

cas-ictp · ‎2019-04-29

Hello,

I have imported data in a Qlik table with SQL select. After the import I generated a unique ID, because I need to remove duplicates based on that ID.

I have the following variables; itemNo, shortItemNo, lotNo, location, date, time, ID. I need to remove all duplicate ID's (and according rows). I searched the Qlik forum, but could not find examples that work for me.

I hope someone can help me. Thanks!

zhadrakas · ‎2019-04-30

please tell me which line do you want to Keep.

sample:

imagine you just want to Keep the first line of ID "AAA011lQ11901613"

you could use:

Load distinct id,
firstvalue(shortitemNo) as shortitemNo,
firstvalue(itemNo) as itemNo,
firstvalue(lotNo) as lotNo

...

Group by id

;

thats just an example. if this is not what you want to achive, please describe your desired Output based on the example above.

regards

tim

View solution in original post

zhadrakas · ‎2019-04-29

after you've created your unique id, did you try something like this:

Final_Table:

Noconcatenate

Load Distinct Unique_ID, other_fields

resident your_table

;

drop table your_table;

regards

tim

cas-ictp · ‎2019-04-29

Hello Tim,

Thank you for you help.

Yes I tried, without result. Unfortunately all duplicates ID are still there.

zhadrakas · ‎2019-04-29

can you share a sample dataset?

cas-ictp · ‎2019-04-30

See attachment.

miskinmaz · ‎2019-04-30

Hi,
You can try the below code:

Load distinct id,
field1,
field2,
and so on...,
sum(1) as test_col
resident table
group by
id,
field1,
field2

Try to keep only necessary fields.

cas-ictp · ‎2019-04-30

Hi,

I did not work. If I only use column ID it works, but as soon as I add one or more other variables, duplicates are not removed.

zhadrakas · ‎2019-04-30

please tell me which line do you want to Keep.

sample:

imagine you just want to Keep the first line of ID "AAA011lQ11901613"

you could use:

Load distinct id,
firstvalue(shortitemNo) as shortitemNo,
firstvalue(itemNo) as itemNo,
firstvalue(lotNo) as lotNo

...

Group by id

;

thats just an example. if this is not what you want to achive, please describe your desired Output based on the example above.

regards

tim

jonathandienst · ‎2019-04-30

LOAD Distinct will remove duplicate identical records, but will not remove duplicate values in one field if they differ in others. To exclude duplicate of a single field, you could use this pattern:

LOAD *
Where Not(Exists(KeyField));
SQL SELECT KeyField,
...
FROM ...;

This will grab the first instance of KeyField, and there will be no indication of excluded records, so make sure this is the correct way to handle your data.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein

cas-ictp · ‎2019-04-30

Hi Tim,

It works, thank you for your help!

This is what I want to achieve.

Regards,

Cas