Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I need to split this string on each $A where interval between following $A-date is more than 60 days?
How would I achieve this?
RecordId, String
RecordId | OriginalString |
---|---|
1000 | A01.07.2015 00:00:00|$S07.07.2015 00:00:00|$B07.07.2015 00:00:00|$A10.07.2015$CK16.07.2015 00:00:00|$O16.07.2015 00:00:00|$A12.10.2015 00:00:00|$S12.10.2015 00:00:00|$CI16.10.2015 00:00:00|$A19.07.2016 00:00:00|$S20.07.2016 00:00:00|$B04.08.2016 00:00:00|$CI23.08.2016 00:00:00 |
Wanted output:
RecordId | SplitString |
---|---|
1000_1 | A01.07.2015 00:00:00|$S07.07.2015 00:00:00|$B07.07.2015 00:00:00|$A10.07.2015$CK16.07.2015 00:00:00|$O16.07.2015 00:00:00 |
1000_2 | $A12.10.2015 00:00:00|$S12.10.2015 00:00:00|$CI16.10.2015 00:00:00 |
1000_3 | $A19.07.2016 00:00:00|$S20.07.2016 00:00:00|$B04.08.2016 00:00:00|$CI23.08.2016 00:00:00 |
Can't use subfield as it would split into 4 rows, not taking my condition into consideration.
Try this....
Table:
LOAD RowNo() as Sort,
RecordId,
OriginalString,
'$A' & NewString as NewString,
Date(Floor(Date#(SubField(NewString, '|', 1), 'DD.MM.YYYY hh:mm:ss'))) as Date
Where Len(Trim(NewString)) > 0;
LOAD RecordId,
OriginalString,
SubField('|' & OriginalString, '|$A') as NewString;
LOAD * INLINE [
RecordId, OriginalString
1000, $A01.07.2015 00:00:00|$S07.07.2015 00:00:00|$B07.07.2015 00:00:00|$A10.07.2015 00:00:00|$CK16.07.2015 00:00:00|$O16.07.2015 00:00:00|$A12.10.2015 00:00:00|$S12.10.2015 00:00:00|$CI16.10.2015 00:00:00|$A19.07.2016 00:00:00|$S20.07.2016 00:00:00|$B04.08.2016 00:00:00|$CI23.08.2016 00:00:00
];
TempTable:
LOAD *,
If(Date <= Previous(Date) + 60, Alt(Peek('Flag'), 1), RangeSum(Peek('Flag'), 1)) as Flag
Resident Table;
FinalTable:
LOAD RecordId & '_' & Flag as RecordId,
Concat(DISTINCT NewString, '|') as SplitString
Resident TempTable
Group By RecordId, Flag;
DROP Table Table, TempTable;
Try this:
Table:
LOAD RowNo() as Sort,
*,
Date(Floor(Date#(KeepChar(NewString, '0123456789.: '), 'DD.MM.YYYY hh:mm:ss'))) as Date;
LOAD RecordId,
OriginalString,
SubField(OriginalString, '|') as NewString;
LOAD * INLINE [
RecordId, OriginalString
1000, A01.07.2015 00:00:00|$S07.07.2015 00:00:00|$B07.07.2015 00:00:00|$A10.07.2015 00:00:00|$CK16.07.2015 00:00:00|$O16.07.2015 00:00:00|$A12.10.2015 00:00:00|$S12.10.2015 00:00:00|$CI16.10.2015 00:00:00|$A19.07.2016 00:00:00|$S20.07.2016 00:00:00|$B04.08.2016 00:00:00|$CI23.08.2016 00:00:00
];
TempTable:
LOAD *,
If(Date <= Previous(Date) + 60, Alt(Peek('Flag'), 1), RangeSum(Peek('Flag'), 1)) as Flag
Resident Table;
FinalTable:
LOAD RecordId & '_' & Flag as RecordId,
Concat(DISTINCT NewString, '|') as SplitString
Resident TempTable
Group By RecordId, Flag;
DROP Table Table, TempTable;
Unfortunately not 100% correct Sunny. This only test for 60 days between one of the process steps. Not only the $A's.
So the comparison needs to be only for $A?
Yes, comparisons only for the $A's.
Try this....
Table:
LOAD RowNo() as Sort,
RecordId,
OriginalString,
'$A' & NewString as NewString,
Date(Floor(Date#(SubField(NewString, '|', 1), 'DD.MM.YYYY hh:mm:ss'))) as Date
Where Len(Trim(NewString)) > 0;
LOAD RecordId,
OriginalString,
SubField('|' & OriginalString, '|$A') as NewString;
LOAD * INLINE [
RecordId, OriginalString
1000, $A01.07.2015 00:00:00|$S07.07.2015 00:00:00|$B07.07.2015 00:00:00|$A10.07.2015 00:00:00|$CK16.07.2015 00:00:00|$O16.07.2015 00:00:00|$A12.10.2015 00:00:00|$S12.10.2015 00:00:00|$CI16.10.2015 00:00:00|$A19.07.2016 00:00:00|$S20.07.2016 00:00:00|$B04.08.2016 00:00:00|$CI23.08.2016 00:00:00
];
TempTable:
LOAD *,
If(Date <= Previous(Date) + 60, Alt(Peek('Flag'), 1), RangeSum(Peek('Flag'), 1)) as Flag
Resident Table;
FinalTable:
LOAD RecordId & '_' & Flag as RecordId,
Concat(DISTINCT NewString, '|') as SplitString
Resident TempTable
Group By RecordId, Flag;
DROP Table Table, TempTable;
Thank you very much Sunny. Learned a lot from this...
No problem at all... I am glad I was able to help you learn
Best,
Sunny