Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
has35526
Contributor III
Contributor III

Remove Duplicate row in set analysi

Hi Everyone,

I need to remove duplicate records in set analysis. How can I remove this duplicate record in set analysis. I tried using distinct function in filename sand distinct function is not working. For example I need to remove RowNo 2431 as it is identical to RowNo 2332

has35526_1-1682895008848.png

 

 

 

Labels (2)
1 Solution

Accepted Solutions
BrunPierre
Partner - Master
Partner - Master

Hi, @has35526 perhaps as below.

Tmp:
LOAD Hash128("File Name", "Processing Date", "Mailing Date", <List rest of fields>) as Hash,
*
FROM DataSource;

Data:
LOAD *
Where Duplicates# = 0;

LOAD Hash,
Count(Hash)-1 as Duplicates#
Resident Tmp
Group By Hash;

Left Join(Data)
LOAD *
Resident Tmp;

DROP Fields Hash, Duplicates# From Data;
DROP Table Tmp;

EXIT SCRIPT;

View solution in original post

6 Replies
Lisa_P
Employee
Employee

It is because the row number is there and it is different that it is showing both

has35526
Contributor III
Contributor III
Author

Yes, I need to remove one of the record. The sum function is making the count double. so for example if I use sum of Manifested package I am getting a value of 2 instead of 1

Lisa_P
Employee
Employee

When you load the data, where are the duplicates coming from ?

has35526
Contributor III
Contributor III
Author

The duplicates are coming from the data.

 

Is there a way to remove them in Qlik rather than the data it self?

Lisa_P
Employee
Employee

Is there something else in the raw data coming in that is causing it to come in on different rows, if not you could try a load distinct ...

BrunPierre
Partner - Master
Partner - Master

Hi, @has35526 perhaps as below.

Tmp:
LOAD Hash128("File Name", "Processing Date", "Mailing Date", <List rest of fields>) as Hash,
*
FROM DataSource;

Data:
LOAD *
Where Duplicates# = 0;

LOAD Hash,
Count(Hash)-1 as Duplicates#
Resident Tmp
Group By Hash;

Left Join(Data)
LOAD *
Resident Tmp;

DROP Fields Hash, Duplicates# From Data;
DROP Table Tmp;

EXIT SCRIPT;