Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Have multiple records with same field value. Only want one record per distinct field value.

Hi community,  I have a field say social security number that has mulitple records in the table due to different employee numbers or name changes.  I only want to load one record, I don't care which, per social security number.  Any ideas how to do that?

1 Solution

Accepted Solutions
manojkvrajan
Luminary
Luminary

Brandon, Do you have an 'Update Time' for these records? Please follow this logic if Yes and let me know if this works.

Temp:

LOAD DISTINCT SSN,

max ([Updated Time]) AS UpdatedTime

FROM xyz.qvd

GROUP BY SSN;

Key:

LOAD *,

SSN&'_'&UpdatedTime AS Key

RESIDENT Temp;

DROP TABLE Temp;

ActualTable:

LOAD *,

SSN&'_'&UpdatedTime AS Key

WHERE EXISTS (Key, SSN&'_'UpdatedTime);

DROP TABLE Key;

DROP FIELD Key;

View solution in original post

2 Replies
manojkvrajan
Luminary
Luminary

Brandon, Do you have an 'Update Time' for these records? Please follow this logic if Yes and let me know if this works.

Temp:

LOAD DISTINCT SSN,

max ([Updated Time]) AS UpdatedTime

FROM xyz.qvd

GROUP BY SSN;

Key:

LOAD *,

SSN&'_'&UpdatedTime AS Key

RESIDENT Temp;

DROP TABLE Temp;

ActualTable:

LOAD *,

SSN&'_'&UpdatedTime AS Key

WHERE EXISTS (Key, SSN&'_'UpdatedTime);

DROP TABLE Key;

DROP FIELD Key;

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Here's one approach.

data:

LOAD *, RecNo() as RecId INLINE [

SSN, Name

1234, Rob

1234, Rob W

4567, Jane

7890, Joe

7890, Joey

]

;

Inner JOIN (data)

LOAD min(RecId) as RecId

RESIDENT data

GROUP By SSN

;