Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Have multiple records with same field value. Only want one record per distinct field value.

Hi community,  I have a field say social security number that has mulitple records in the table due to different employee numbers or name changes.  I only want to load one record, I don't care which, per social security number.  Any ideas how to do that?

1 Solution

Accepted Solutions
manojkvrajan
Luminary
Luminary

Brandon, Do you have an 'Update Time' for these records? Please follow this logic if Yes and let me know if this works.

Temp:

LOAD DISTINCT SSN,

max ([Updated Time]) AS UpdatedTime

FROM xyz.qvd

GROUP BY SSN;

Key:

LOAD *,

SSN&'_'&UpdatedTime AS Key

RESIDENT Temp;

DROP TABLE Temp;

ActualTable:

LOAD *,

SSN&'_'&UpdatedTime AS Key

WHERE EXISTS (Key, SSN&'_'UpdatedTime);

DROP TABLE Key;

DROP FIELD Key;

View solution in original post

2 Replies
manojkvrajan
Luminary
Luminary

Brandon, Do you have an 'Update Time' for these records? Please follow this logic if Yes and let me know if this works.

Temp:

LOAD DISTINCT SSN,

max ([Updated Time]) AS UpdatedTime

FROM xyz.qvd

GROUP BY SSN;

Key:

LOAD *,

SSN&'_'&UpdatedTime AS Key

RESIDENT Temp;

DROP TABLE Temp;

ActualTable:

LOAD *,

SSN&'_'&UpdatedTime AS Key

WHERE EXISTS (Key, SSN&'_'UpdatedTime);

DROP TABLE Key;

DROP FIELD Key;

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Here's one approach.

data:

LOAD *, RecNo() as RecId INLINE [

SSN, Name

1234, Rob

1234, Rob W

4567, Jane

7890, Joe

7890, Joey

]

;

Inner JOIN (data)

LOAD min(RecId) as RecId

RESIDENT data

GROUP By SSN

;