Skip to main content
Announcements
Qlik Introduces a New Era of Visualization! READ ALL ABOUT IT
cancel
Showing results for 
Search instead for 
Did you mean: 
Maanasa
Contributor II
Contributor II

Duplicate rows in qlik

Hi, Here is my sample table, how do I eliminate duplicate rows where A--> B and B--> A , first two rows in below example needs to be considered as one single row.

Load * inline [

testID, col1, col2

1, A, B

1, B, A

1, C, A

2, D, E

3, F, H

];

 

thanks in advance!

Labels (2)
1 Solution

Accepted Solutions
Vegar
MVP
MVP

If you want A|B  to be considered equal to B|A then you should consider making the rows identical in the load script.

Maybe like this:

Load testID , rangeminstring(col1,col2) as col1,rangemaxstring(col1,col2)  as col2

inline [

testID, col1, col2

1, A, B

1, B, A

1, C, A

2, D, E

3, F, H

];

 

View solution in original post

8 Replies
MeehyeOh
Partner - Creator
Partner - Creator

Hi, @Maanasa 

You mean that the first two rows in the example should be considered as one single row because "A" is the same as "B"?

Then which one of the two lines should be left?

Is there a standard you have?

Aasir
Creator III
Creator III

compare col1 with col2 and create a temp table with ordering.

create a final table with temp table data.

Vegar
MVP
MVP

If you want A|B  to be considered equal to B|A then you should consider making the rows identical in the load script.

Maybe like this:

Load testID , rangeminstring(col1,col2) as col1,rangemaxstring(col1,col2)  as col2

inline [

testID, col1, col2

1, A, B

1, B, A

1, C, A

2, D, E

3, F, H

];

 

marksouzacosta
Partner - Creator III
Partner - Creator III

Quite good exercise @Maanasa 

So, to eliminate the first two records you can do the following:

InputData:
LOAD
  testID, 
  col1, 
  col2,
  Hash128(col1, col2) AS Hash12,
  Hash128(col2, col1) AS Hash21
;
Load * inline [
testID, col1, col2
1, A, B
1, B, A
1, C, A
2, D, E
3, F, H
];

FinalResult:
LOAD
  testID, 
  col1, 
  col2
RESIDENT
	InputData
WHERE
	Not Exists(Hash12,Hash21)
;

DROP TABLE InputData;

 

This will be your end result:

marksouzacosta_0-1717130803406.png

 

Read more at Data Voyagers - datavoyagers.net
Lech_Miszkiewicz
Partner Ambassador/MVP
Partner Ambassador/MVP

I like your approach @marksouzacosta - well done!

cheers Lech, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful to the problem.
Rohan
Specialist
Specialist

Hi @Maanasa ,

The only addition to @Vegar 's script is, you will have to do is add Distinct while loading then you eliminate the duplicate row, Refer the attached image : 

Rohan_0-1717140616722.png

 

@marksouzacosta , your code is cool too, but it will completely eliminate the duplicates, where as atleast one of them is to be retrieved in the data. Also, we can simply do a mapping load too to achieve the same result : 

Rohan_1-1717140715977.png

 

 

Regards,

Rohan.

 

Maanasa
Contributor II
Contributor II
Author

Thanks Mark!, I dont want to eliminate first two rows I want to keep one of them. this is deduping exercise. I need either one of the first two row. A-> B or B--> A.

Maanasa
Contributor II
Contributor II
Author

Thank you so much! that actually worked. 🙂