Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
zach_paz
Contributor III
Contributor III

Removing Duplicates based on Date and Time Stamp

I am trying to load only the most recent records based on date and time stamp.

I have one QVD with the following records:

lead_id

status_name_update

status_date_update

There are multiple status names with a date and time stamp for each lead_id. Example:

lead_id  status_name_update  status_date_update

1234     contacted                   41456.942928241

1234     converted                    41456.078564815

I want only the latest record to be loaded:

lead_id  status_name_update  status_date_update

1234     converted                   41456.942928241

This is the table I am using:

LOAD lead_id,

          status_name_update,

          status_date_update

FROM

(qvd);

I have tried multiple solutions from other QV Community posts but cannot seem to get anything working properly. Please help!

1 Solution

Accepted Solutions
swuehl
MVP
MVP

Maybe like this:

INPUT:

LOAD lead_id,

          status_name_update,

          status_date_update

FROM

(qvd);

CURRENT:

LOAD lead_id,

          FirstSortedValue(status_name_update, -status_date_update) as LastStatus,

          max(status_date_update) as LastUpdate

RESIDENT INPUT GROUP BY lead_id;

DROP TABLE INPUT;

View solution in original post

2 Replies
swuehl
MVP
MVP

Maybe like this:

INPUT:

LOAD lead_id,

          status_name_update,

          status_date_update

FROM

(qvd);

CURRENT:

LOAD lead_id,

          FirstSortedValue(status_name_update, -status_date_update) as LastStatus,

          max(status_date_update) as LastUpdate

RESIDENT INPUT GROUP BY lead_id;

DROP TABLE INPUT;

zach_paz
Contributor III
Contributor III
Author

It works!

Thank you!!