Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Has anyone come across a way to remove both records of a duplicate pair?
For example, in the below table, record 1 & 2 are exact duplicates. Is there a way to exclude these both (1&2) but still load 3-5 in the load statement?
I have tried several ways using peek(), recno(), rowno(), group by, order by, etc as aggr and rank cannot be used. I have gotten down to one of the duplicate pair, but for my requirements, we are looking to exclude both pieces.
col 1 | col 2 | col 3 | col 4 | |
1 | abc | ghi | rst | xyz |
2 | abc | ghi | rst | xyz |
3 | def | jkl | rst | xyz |
4 | abc | ghi | uvw | xyz |
5 | def | jkl | rst | uvw |
thank you so much!
Have you tried where distinct
Or maybe this approach might work
load a TempTable and set up a new field (col1 & col2 & col3 & col4) as TempField
the do a
join(TempTable)
load
TempField,
count (TempField) as CountTemp
resident TempTable
group by (TempField)
then do
Table:
noconcatenate (etc)
load
*
resident TempTable
where CountTemp = 1 ;
Drop Fields CountTemp, TempField ;
Drop Table TempTable ;
How about:
Data:
LOAD F1 as Key,
[col 1],
[col 2],
[col 3],
[col 4],
Hash128([col 1],[col 2],[col 3],[col 4]) as Hash
FROM
[https://community.qlik.com/t5/New-to-Qlik-Sense/Remove-both-records-of-a-duplicate-in-script-load-ed...]
(html, utf8, UserAgent is 'Mozilla/5.0', embedded labels, table is @1);
Inner Join(Data)
Hash:
LOAD *
Where HashCount = 1
;
LOAD
Hash,
Count(Hash) as HashCount
Resident Data
Group By Hash
;
If, as you indicated in other responses, you already have a combined field to test, you can skip the Hash128() and use your combined field instead.
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
Have you tried where distinct
Or maybe this approach might work
load a TempTable and set up a new field (col1 & col2 & col3 & col4) as TempField
the do a
join(TempTable)
load
TempField,
count (TempField) as CountTemp
resident TempTable
group by (TempField)
then do
Table:
noconcatenate (etc)
load
*
resident TempTable
where CountTemp = 1 ;
Drop Fields CountTemp, TempField ;
Drop Table TempTable ;
Good idea,
I do have a field already as a combination of those columns in my table.
If I have that already in my main table, how might I then reference that instead of a temp table? I think I see where your code was going but cant exactly get it any further than the load that brings in duplicates.
thank you
You just need to do a 4 step process
1. Load the data and make a combined field (fields 1+2+3+4)
2. Do count and join with the above table
3. Then create the table excluding rows with 2 or more identical records (making sure its not merged with the step 1 table)
4. Drop initial loaded table and any fields not needed
How about:
Data:
LOAD F1 as Key,
[col 1],
[col 2],
[col 3],
[col 4],
Hash128([col 1],[col 2],[col 3],[col 4]) as Hash
FROM
[https://community.qlik.com/t5/New-to-Qlik-Sense/Remove-both-records-of-a-duplicate-in-script-load-ed...]
(html, utf8, UserAgent is 'Mozilla/5.0', embedded labels, table is @1);
Inner Join(Data)
Hash:
LOAD *
Where HashCount = 1
;
LOAD
Hash,
Count(Hash) as HashCount
Resident Data
Group By Hash
;
If, as you indicated in other responses, you already have a combined field to test, you can skip the Hash128() and use your combined field instead.
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
Thank you @robert99 & @rwunderlich
This worked perfectly. I was looking at fields from another table when validating yesterday that held the same data which prompted me asking again for clarification. Great solution and thank you again for your assistance!
Cheers!