Qlik Community

QlikView Scripting

Discussion Board for collaboration on QlikView Scripting.

Announcements
QlikView Fans! We’d love to hear from you.
Share your QlikView feedback with the product team… Click here to participate in our 5-minute survey.
Rules, plus terms and conditions, can be found here.
hkg_qlik
Contributor II

Remove Duplicates

Hi,

I have a table which holds two key fields: MemberID and CustomerID.

Before moving forward I want to check for any duplicates in table with a Key combination of MemberID + CustomerID. (Both reside in the same table)

Any thoughts?

Thanks,

H

1 Solution

Accepted Solutions
jason_michaelid
Honored Contributor II

Re: Remove Duplicates

Try aggregating the resident table and joining to itself:

Data:

LOAD

   MemberID,

   CustomerID,

   1 AS Count

From.....;

JOIN (Data)

LOAD

   MemberID,

   CustomerID,

   Sum(Count) AS Freq

Resident Data

Group by MemberID,CustomerID;

That will show you the frequency of each row.

Hope this helps,

Jason

13 Replies
jason_michaelid
Honored Contributor II

Re: Remove Duplicates

Try aggregating the resident table and joining to itself:

Data:

LOAD

   MemberID,

   CustomerID,

   1 AS Count

From.....;

JOIN (Data)

LOAD

   MemberID,

   CustomerID,

   Sum(Count) AS Freq

Resident Data

Group by MemberID,CustomerID;

That will show you the frequency of each row.

Hope this helps,

Jason

hkg_qlik
Contributor II

Re: Remove Duplicates

I want to flag the duplicates and remove them from the table.

jason_michaelid
Honored Contributor II

Re: Remove Duplicates

I assume you want to leave one of each duplicate record behind? Will all other fields also be identical? If not, how do you decide which record to keep?

hkg_qlik
Contributor II

Re: Remove Duplicates

Yes all other fields would be identical. So I can keep any one record of the two or more.

jason_michaelid
Honored Contributor II

Re: Remove Duplicates

Then maybe just LOAD DISTINCT in the first place....

hkg_qlik
Contributor II

Re: Remove Duplicates

So something like this:

Load Distinct MemberID&' '&CustomerID as Key,

MemberID,

CustomerID,

.

.

.

From

ABC.qvd

I want to check when the combination of MemberID and CustomerID is repeating in the table.

manojkvrajan
Contributor III

Re: Remove Duplicates

Script mentioned  below brings only the data that has the combination > 1. I hope you can take it forward from here to avoid data duplication using the timestamp or anything specific to your requirements. Does it help?

Temp:


LOAD DISTINCT MemberID, CustomerID,
COUNT(MemberID&'_'&CustomerID) AS Count
FROM
(SourceFile.xls )
GROUP BY MemberID, CustomerID;

Key:


LOAD MemberID&'_'&CustomerID AS Key
RESIDENT Temp
WHERE Count >1;


DROP TABLE Temp;

ActualTable:

LOAD *
FROM
(SourceFile)
WHERE EXISTS (Key,MemberID&'_'&CustomerID);

DROP TABLE Key;

jason_michaelid
Honored Contributor II

Re: Remove Duplicates

I meant simply:

LOAD DISTINCT * FROM Data.qvd(qvd);

Jason

hkg_qlik
Contributor II

Re: Remove Duplicates

Thank you all for your inputs.

Community Browser