Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Bucharest on Sept 18th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
has35526
Contributor III
Contributor III

Remove Duplicate row in set analysi

Hi Everyone,

I need to remove duplicate records in set analysis. How can I remove this duplicate record in set analysis. I tried using distinct function in filename sand distinct function is not working. For example I need to remove RowNo 2431 as it is identical to RowNo 2332

has35526_1-1682895008848.png

 

 

 

Labels (2)
1 Solution

Accepted Solutions
BrunPierre
Partner - Master II
Partner - Master II

Hi, @has35526 perhaps as below.

Tmp:
LOAD Hash128("File Name", "Processing Date", "Mailing Date", <List rest of fields>) as Hash,
*
FROM DataSource;

Data:
LOAD *
Where Duplicates# = 0;

LOAD Hash,
Count(Hash)-1 as Duplicates#
Resident Tmp
Group By Hash;

Left Join(Data)
LOAD *
Resident Tmp;

DROP Fields Hash, Duplicates# From Data;
DROP Table Tmp;

EXIT SCRIPT;

View solution in original post

6 Replies
Lisa_P
Employee
Employee

It is because the row number is there and it is different that it is showing both

has35526
Contributor III
Contributor III
Author

Yes, I need to remove one of the record. The sum function is making the count double. so for example if I use sum of Manifested package I am getting a value of 2 instead of 1

Lisa_P
Employee
Employee

When you load the data, where are the duplicates coming from ?

has35526
Contributor III
Contributor III
Author

The duplicates are coming from the data.

 

Is there a way to remove them in Qlik rather than the data it self?

Lisa_P
Employee
Employee

Is there something else in the raw data coming in that is causing it to come in on different rows, if not you could try a load distinct ...

BrunPierre
Partner - Master II
Partner - Master II

Hi, @has35526 perhaps as below.

Tmp:
LOAD Hash128("File Name", "Processing Date", "Mailing Date", <List rest of fields>) as Hash,
*
FROM DataSource;

Data:
LOAD *
Where Duplicates# = 0;

LOAD Hash,
Count(Hash)-1 as Duplicates#
Resident Tmp
Group By Hash;

Left Join(Data)
LOAD *
Resident Tmp;

DROP Fields Hash, Duplicates# From Data;
DROP Table Tmp;

EXIT SCRIPT;