Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a scenario, where I am getting daily files to be loaded, in which I have to show only the latest records based on Name and ID.
For example:
ID Name Action SourceFilename
12 A Immediate May12
12 A Immediate May13
15 B Future May12
15 B Immediate May13
Expected Output:
ID Name Action SourceFilename
12 A Immediate May13
15 B Future May12
15 B Immediate May13
I have different action for name B, so I am expecting 2 records as it is from the data.
thanks for any help on this
Regards
Prashanth
One way would be to load in all of the data with all of the dates. Then do this:
INNER JOIN (MyTable)
LOAD
ID
,Name
,Action
,max(SourceFilename) as SourceFilename
RESIDENT MyTable
;
That's assuming that your SourceFilename field is actually a date. If not, you'd want a SourceFileDate field, and do the max() on it instead.
Hi John,
when i try the same, am getting error loading message as Invalid expression
Add a group by statement and you should be fine:
INNER JOIN (MyTable)
LOAD
ID
,Name
,Action
,max(SourceFilename) as SourceFilename
RESIDENT MyTable
Group By Name, Action;
Yeah I tried the same, it's showing up the max date of ID, what I am expecting is latest should be populated as shown in my example above.
With the missing group by added (sorry), it gives your desired result on your specified input. If this isn't what you want, then you may want to explain the problem in more detail, perhaps by amending the attached file with additional input data, showing the result, and show what you wanted as the result.
MyTable:
LOAD
ID
,Name
,Action
,date(date#(SourceFilename,'MMMYY'),'MMMYY') as SourceFilename
INLINE [
ID,Name,Action,SourceFilename
12,A,Immediate,May12
12,A,Immediate,May13
15,B,Future,May12
15,B,Immediate,May13
];
INNER JOIN (MyTable)
LOAD
ID
,Name
,Action
,max(SourceFilename) as SourceFilename
RESIDENT MyTable
GROUP BY
ID
,Name
,Action
;