Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a table which holds two key fields: MemberID and CustomerID.
Before moving forward I want to check for any duplicates in table with a Key combination of MemberID + CustomerID. (Both reside in the same table)
Any thoughts?
Thanks,
H
Try aggregating the resident table and joining to itself:
Data:
LOAD
MemberID,
CustomerID,
1 AS Count
From.....;
JOIN (Data)
LOAD
MemberID,
CustomerID,
Sum(Count) AS Freq
Resident Data
Group by MemberID,CustomerID;
That will show you the frequency of each row.
Hope this helps,
Jason
Try aggregating the resident table and joining to itself:
Data:
LOAD
MemberID,
CustomerID,
1 AS Count
From.....;
JOIN (Data)
LOAD
MemberID,
CustomerID,
Sum(Count) AS Freq
Resident Data
Group by MemberID,CustomerID;
That will show you the frequency of each row.
Hope this helps,
Jason
I want to flag the duplicates and remove them from the table.
I assume you want to leave one of each duplicate record behind? Will all other fields also be identical? If not, how do you decide which record to keep?
Yes all other fields would be identical. So I can keep any one record of the two or more.
Then maybe just LOAD DISTINCT in the first place....
So something like this:
Load Distinct MemberID&' '&CustomerID as Key,
MemberID,
CustomerID,
.
.
.
From
ABC.qvd
I want to check when the combination of MemberID and CustomerID is repeating in the table.
Script mentioned below brings only the data that has the combination > 1. I hope you can take it forward from here to avoid data duplication using the timestamp or anything specific to your requirements. Does it help?
Temp:
LOAD DISTINCT MemberID, CustomerID,
COUNT(MemberID&'_'&CustomerID) AS Count
FROM
(SourceFile.xls )
GROUP BY MemberID, CustomerID;
Key:
LOAD MemberID&'_'&CustomerID AS Key
RESIDENT Temp
WHERE Count >1;
DROP TABLE Temp;
ActualTable:
LOAD *
FROM
(SourceFile)
WHERE EXISTS (Key,MemberID&'_'&CustomerID);
DROP TABLE Key;
I meant simply:
LOAD DISTINCT * FROM Data.qvd(qvd);
Jason
Thank you all for your inputs.