Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I currently have the below data:
Id, Date
1 | 01/01/2011 |
1 | 01/02/2011 |
1 | 01/03/2011 |
1 | 01/06/2011 |
1 | 01/09/2011 |
2 | 01/01/2011 |
2 | 01/02/2011 |
2 | 01/04/2011 |
I would like to add an additional column of 'Output', which would be a sequential incrament based per id and then as the dates increases:
Id, Date Output
1 | 01/01/2011 | 1
1 | 01/02/2011 | 2
1 | 01/03/2011 | 3
1 | 01/06/2011 | 4
1 | 01/09/2011 | 5
2 | 01/01/2011 | 1
2 | 01/02/2011 | 2
2 | 01/04/2011 | 3
Any help would be much appreciated!
Thanks.
Hi,
Do the following
TableTemp:
LOAD * INLINE [
Id, Date
1, 01/01/2011
1, 01/02/2011
1, 01/03/2011
1, 01/06/2011
1, 01/09/2011
2, 01/01/2011
2, 01/02/2011
2, 01/04/2011
];
Table:
LOAD *,
If(Id = Previous(Id), RangeSum(1, Peek('Output')), 1) AS Output
RESIDENT TableTemp
ORDER BY Id;
DROP TABLE TableTemp;
Ok, elaborating on what I did: to be able to use the ORDER BY clause you need to do a RESIDENT load (assuming that your data is not going to be already ordered in the data source). Having this double step load, you can create a new field "Output" so if the Id is the same as in the previous record, adds 1 to the counter, otherwise, counter sets to 1 (the minimum possible for each Id).
Peek() returns the latest loaded value for the field specified (the value in the previous record, in this case). RangeSum() adds 1 to the retrieved value. It's not strictly needed, something like that instead
If(Id = Previous(Id), 1 + Peek('Output'), 1) AS Output
should work just fine as well. But RangeSum() will return 0 if any value is non numeric, which is very convenient.
Hope that helps.
Miguel
Hi,
From where this data is been loading, if this data is been loading from SQL Server then use the below script as a column in sql script.
row_number() over (partition by Id order by Id) as RowNumber
Hope this helps you.
Regards,
Jagan.
Hi,
Do the following
TableTemp:
LOAD * INLINE [
Id, Date
1, 01/01/2011
1, 01/02/2011
1, 01/03/2011
1, 01/06/2011
1, 01/09/2011
2, 01/01/2011
2, 01/02/2011
2, 01/04/2011
];
Table:
LOAD *,
If(Id = Previous(Id), RangeSum(1, Peek('Output')), 1) AS Output
RESIDENT TableTemp
ORDER BY Id;
DROP TABLE TableTemp;
Ok, elaborating on what I did: to be able to use the ORDER BY clause you need to do a RESIDENT load (assuming that your data is not going to be already ordered in the data source). Having this double step load, you can create a new field "Output" so if the Id is the same as in the previous record, adds 1 to the counter, otherwise, counter sets to 1 (the minimum possible for each Id).
Peek() returns the latest loaded value for the field specified (the value in the previous record, in this case). RangeSum() adds 1 to the retrieved value. It's not strictly needed, something like that instead
If(Id = Previous(Id), 1 + Peek('Output'), 1) AS Output
should work just fine as well. But RangeSum() will return 0 if any value is non numeric, which is very convenient.
Hope that helps.
Miguel
Great Miguel!
Regards,
Jagan.
Thanks!