Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Team,
CallNo | StartDateTime | yDuplicate |
1 | 10/1/2009 10:15:00 AM | |
2 | 10/1/2009 10:15:00 AM | Duplicate |
3 | 10/3/2009 9:00:00 AM | |
4 | 10/3/2009 9:00:00 AM | Duplicate |
5 | 10/5/2009 9:00:00 AM | |
6 | 10/5/2009 9:00:00 AM | Duplicate |
7 | 10/5/2009 9:00:00 AM | Duplicate |
8 | 10/5/2009 10:30:00 AM | |
9 | 10/5/2009 10:30:00 AM | Duplicate |
In the above output i tried to capture Duplicate record if StartDateTime is same for more than one call. I am using Peek() funcion for it ->
If( Peek(StartDateTime) = StartDateTime,'Duplicate')
I am getting the correct result,,,,however i want bit more. As you can see, "Duplicate" is not written when the record encountered first time. I want "Duplicate" written for callno 1,3,5, & 8 also. I tried to use like
If( Peek(StartDateTime) = StartDateTime Or Peek(StartDateTime,1) = StartDateTime,'Duplicate')
But its nor working.
Please suggest!!!
You can "peek ahead" if you are loading from a previously loaded table and you specify the absolute row number. So assuming your data had been loaded into a table named "data', you could add the duplicate flag like this:
RIGHT JOIN LOAD DISTINCT *,
if(peek('StartDateTime') = StartDateTime OR peek('StartDateTime', recno(), 'data') = StartDateTime, 'Duplicate') as Dupe
RESIDENT data
;
A more flexible way to set the dupe flag may be something like this.
LEFT JOIN (data) LOAD StartDateTime, 'Duplicate' as DupeFlag
WHERE count > 1
;
LOAD StartDateTime, count(StartDateTime) as count
RESIDENT data
GROUP BY StartDateTime;
-Rob
Thanks Rob.
Later one worked.
I didn't get the first concept. Moreover, in later one, I have one query. When loading count > 1 record, no Resident or From clause specified for this load..
Arun Mittal wrote:When loading count > 1 record, no Resident or From clause specified for this load..
This is a "preceeding load". The input to this load is the following load where RESIDENT is specified. The bottom load is executed first and the data is piped into the preceeding load. This syntax was required because I could not create the count() value and test it in the same step.
-Rob