Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
SonOfJeffGoldblum
Contributor III
Contributor III

Remove both records of a duplicate in script load editor qlik sense

Has anyone come across a way to remove both records of a duplicate pair?

For example, in the below table, record 1 & 2 are exact duplicates. Is there a way to exclude these both (1&2) but still load 3-5 in the load statement?

I have tried several ways using peek(), recno(), rowno(), group by, order by, etc as aggr and rank cannot be used. I have gotten down to one of the duplicate pair, but for my requirements, we are looking to exclude both pieces.

 col 1col 2col 3col 4
1abcghirstxyz
2abcghirstxyz
3defjklrstxyz
4abcghiuvwxyz
5defjklrstuvw

 

thank you so much!

Everything not saved will be lost
— Nintendo Quit screen message
Labels (3)
2 Solutions

Accepted Solutions
robert99
Specialist III
Specialist III

Have you tried where distinct

Or maybe this approach might work

load a TempTable and set up a new field (col1 & col2 & col3 & col4)  as TempField

the do a

join(TempTable)

load

TempField,

count (TempField) as CountTemp

resident TempTable

group by (TempField)

then do

Table:

noconcatenate (etc)

load

*

resident TempTable

where CountTemp = 1 ;

Drop Fields CountTemp, TempField ;

Drop Table TempTable ;

 

 

View solution in original post

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

How about:

Data:
LOAD F1 as Key,
    
[col 1],
    
[col 2],
    
[col 3],
    
[col 4],
    
Hash128([col 1],[col 2],[col 3],[col 4]) as Hash
FROM
[https://community.qlik.com/t5/New-to-Qlik-Sense/Remove-both-records-of-a-duplicate-in-script-load-ed...]
(
html, utf8, UserAgent is 'Mozilla/5.0', embedded labels, table is @1);

Inner Join(Data)
Hash:
LOAD *
Where HashCount = 1
;
LOAD
Hash,
Count(Hash) as HashCount
Resident Data
Group By Hash

;

If, as you indicated in other responses, you already have a combined field to test, you can skip the Hash128() and use your combined field instead. 

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

View solution in original post

5 Replies
robert99
Specialist III
Specialist III

Have you tried where distinct

Or maybe this approach might work

load a TempTable and set up a new field (col1 & col2 & col3 & col4)  as TempField

the do a

join(TempTable)

load

TempField,

count (TempField) as CountTemp

resident TempTable

group by (TempField)

then do

Table:

noconcatenate (etc)

load

*

resident TempTable

where CountTemp = 1 ;

Drop Fields CountTemp, TempField ;

Drop Table TempTable ;

 

 

SonOfJeffGoldblum
Contributor III
Contributor III
Author

Good idea,

 I do have a field already as a combination of those columns in my table.

If I have that already in my main table, how might I then reference that instead of a temp table? I think I see where your code was going but cant exactly get it any further than the load that brings in duplicates.

thank you

Everything not saved will be lost
— Nintendo Quit screen message
robert99
Specialist III
Specialist III

You just need to do a 4 step process

1. Load the data and make a combined field (fields 1+2+3+4)

2. Do count and join with the above table

3. Then create the table excluding rows with 2 or more identical records (making sure its not merged with the step 1 table)

4. Drop initial loaded table and any fields not needed

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

How about:

Data:
LOAD F1 as Key,
    
[col 1],
    
[col 2],
    
[col 3],
    
[col 4],
    
Hash128([col 1],[col 2],[col 3],[col 4]) as Hash
FROM
[https://community.qlik.com/t5/New-to-Qlik-Sense/Remove-both-records-of-a-duplicate-in-script-load-ed...]
(
html, utf8, UserAgent is 'Mozilla/5.0', embedded labels, table is @1);

Inner Join(Data)
Hash:
LOAD *
Where HashCount = 1
;
LOAD
Hash,
Count(Hash) as HashCount
Resident Data
Group By Hash

;

If, as you indicated in other responses, you already have a combined field to test, you can skip the Hash128() and use your combined field instead. 

-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com

SonOfJeffGoldblum
Contributor III
Contributor III
Author

Thank you @robert99@rwunderlich 

This worked perfectly. I was looking at fields from another table when validating yesterday that held the same data which prompted me asking again for clarification. Great solution and thank you again for your assistance!

Cheers!

Everything not saved will be lost
— Nintendo Quit screen message