Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
To remove duplicates in following tables.
For Example,
1) We have 10 records, the 5 and 6 records having same data except modified date column.
2) Here, we need to get the max modified date from 5 and 6 and remove the min modified records.
3) Suppose the 5th record is max we need to show 9 records from table except 6th record.
Sample Data:
MTR_ID | Name | Modified Date timestamp |
---|---|---|
ABC01 | Lawrance | 5/27/2017 12:00:00 AM |
ABC02 | Bose | 5/27/2017 1:00:00 AM |
ABC03 | Prathap | 5/27/2017 2:00:00 AM |
ABC04 | Vengadesh | 5/27/2017 3:00:00 AM |
ABC05 | Ranjit | 5/27/2017 4:00:00 AM |
ABC06 | Sathish | 5/27/2017 5:00:00 AM |
ABC07 | Sai Prasad | 5/27/2017 6:00:00 AM |
ABC07 | Sai Prasad | 5/27/2017 6:30:00 AM |
ABC08 | Muthu | 5/27/2017 7:00:00 AM |
ABC09 | Irshad | 5/27/2017 8:00:00 AM |
ABC10 | Naveen | 5/27/2017 9:00:00 AM |
Expected Output:
MTR_ID | Name | Modified Date timestamp |
---|---|---|
ABC01 | Lawrance | 5/27/2017 12:00:00 AM |
ABC02 | Bose | 5/27/2017 1:00:00 AM |
ABC03 | Prathap | 5/27/2017 2:00:00 AM |
ABC04 | Vengadesh | 5/27/2017 3:00:00 AM |
ABC05 | Ranjit | 5/27/2017 4:00:00 AM |
ABC06 | Sathish | 5/27/2017 5:00:00 AM |
ABC07 | Sai Prasad | 5/27/2017 6:30:00 AM |
ABC08 | Muthu | 5/27/2017 7:00:00 AM |
ABC09 | Irshad | 5/27/2017 8:00:00 AM |
ABC10 | Naveen | 5/27/2017 9:00:00 AM |
Thanks,
Lawrance A
You can use, Source table as well like below
Load MTR_ID, Name, MAX([Modified Date timestamp]) AS [Modified Date timestamp] GROUP BY MTR_ID, Name;
LOAD MTR_ID, Name, [Modified Date timestamp] From <Source Path>;
Using max() returns the output that you want:
LOAD MTR_ID, Name, MAX([Modified Date timestamp]) AS [Modified Date timestamp]
INLINE [
MTR_ID, Name, Modified Date timestamp
ABC01, Lawrance, 5/27/2017 12:00:00 AM
ABC02, Bose, 5/27/2017 1:00:00 AM
ABC03, Prathap, 5/27/2017 2:00:00 AM
ABC04, Vengadesh, 5/27/2017 3:00:00 AM
ABC05, Ranjit, 5/27/2017 4:00:00 AM
ABC06, Sathish, 5/27/2017 5:00:00 AM
ABC07, Sai Prasad, 5/27/2017 6:00:00 AM
ABC07, Sai Prasad, 5/27/2017 6:30:00 AM
ABC08, Muthu, 5/27/2017 7:00:00 AM
ABC09, Irshad, 5/27/2017 8:00:00 AM
ABC10, Naveen, 5/27/2017 9:00:00 AM
]
GROUP BY MTR_ID, Name;
Hi,
TMP:
LOAD *, IF(Previous(MTR_ID & Name)<>(MTR_ID & Name),'',Previous([Modified Date timestamp])) AS PRM
;
LOAD *
INLINE [
MTR_ID, Name, Modified Date timestamp
ABC01, Lawrance, 5/27/2017 12:00:00 AM
ABC02, Bose, 5/27/2017 1:00:00 AM
ABC03, Prathap, 5/27/2017 2:00:00 AM
ABC04, Vengadesh, 5/27/2017 3:00:00 AM
ABC05, Ranjit, 5/27/2017 4:00:00 AM
ABC06, Sathish, 5/27/2017 5:00:00 AM
ABC07, Sai Prasad, 5/27/2017 6:00:00 AM
ABC07, Sai Prasad, 5/27/2017 6:30:00 AM
ABC08, Muthu, 5/27/2017 7:00:00 AM
ABC09, Irshad, 5/27/2017 8:00:00 AM
ABC10, Naveen, 5/27/2017 9:00:00 AM
];
[PARAM]:LOAD Concat(CHR(39) & PRM & CHR(39),',') AS PRMt Resident TMP;
LET v_PRM = Peek('PRMt');
NoConcatenate
SUMARY:
LOAD * Resident TMP Where NOT wildMatch([Modified Date timestamp], $(v_PRM));
DROP Field PRM;
DROP Table PARAM,TMP;
EXIT Script;
May be in front end like this -
Data:
LOAD MTR_ID,
Name,
Timestamp([Modified Date timestamp],'DD/MM/YYYY hh:mm:ss') as TimeStamp
FROM Table;
inner join(Data)
LOAD MTR_ID,
Name,
Timestamp(max([Modified Date timestamp]),'DD/MM/YYYY hh:mm:ss') as TimeStamp
Resident Data
Group by
MTR_ID,
Name;
Thanks for your response.
You mentioned using Inline, but I have source in CSV file.
Could you please explain me how to achieve this.
Thanks,
Lawrance A
You can use, Source table as well like below
Load MTR_ID, Name, MAX([Modified Date timestamp]) AS [Modified Date timestamp] GROUP BY MTR_ID, Name;
LOAD MTR_ID, Name, [Modified Date timestamp] From <Source Path>;