Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have a situation where I only want to load the last 4 records for each Account_ID. In the sample data you will see that there is not always a consecutive YearPeriod for each Account_ID.
For Account_ID 300000001982 I would expect to load only rows for:
YearPeriod 201105, 201108, 201109 and 201110 : This would mean that we did not receive payment in payment batches for 201106 and 201107.
For Account_ID 30000000999 only the rows for:
YearPeriod 201104, 201105, 201106 and 201107 : This data would mean the last 4 payments are regular.
I appreciated your assistance. Thank you in advance.
May be this:
TMP:
Load * INLINE [
Account_ID, YearPeriod, Period, Batch_ID
300000001982, 201008, 8, 1
300000001982, 201009, 9, 2
300000001982, 201010, 10, 3
300000001982, 201011, 11, 4
300000001982, 201012, 12, 5
300000001982, 201101, 1, 6
300000001982, 201102, 2, 7
300000001982, 201103, 3, 8
300000001982, 201104, 4, 9
300000001982, 201105, 5, 10
300000001982, 201108, 8, 13
300000001982, 201109, 9, 14
300000001982, 201110, 10, 15
300000001999, 201008, 8, 1
300000001999, 201009, 9, 2
300000001999, 201010, 10, 3
300000001999, 201011, 11, 4
300000001999, 201012, 12, 5
300000001999, 201101, 1, 6
300000001999, 201104, 4, 9
300000001999, 201105, 5, 10
300000001999, 201106, 6, 11
300000001999, 201107, 7, 12
];
TMP2:
LOAD *
Where Key < 5;
LOAD *,
AutoNumber(YearPeriod, Account_ID) as Key
Resident TMP
Order By Account_ID, YearPeriod DESC;
DROP Table TMP;
Hello John,
this is a bad written script but i works as expected.
There should be a better solution but it works.
SOURCE_DATA:
Load * INLINE [
Account_ID, YearPeriod, Period, Batch_ID
300000001982, 201008, 8, 1
300000001982, 201009, 9, 2
300000001982, 201010, 10, 3
300000001982, 201011, 11, 4
300000001982, 201012, 12, 5
300000001982, 201101, 1, 6
300000001982, 201102, 2, 7
300000001982, 201103, 3, 8
300000001982, 201104, 4, 9
300000001982, 201105, 5, 10
300000001982, 201108, 8, 13
300000001982, 201109, 9, 14
300000001982, 201110, 10, 15
300000001999, 201008, 8, 1
300000001999, 201009, 9, 2
300000001999, 201010, 10, 3
300000001999, 201011, 11, 4
300000001999, 201012, 12, 5
300000001999, 201101, 1, 6
300000001999, 201104, 4, 9
300000001999, 201105, 5, 10
300000001999, 201106, 6, 11
300000001999, 201107, 7, 12
];
T1:
Load Distinct Account_ID Resident SOURCE_DATA;
For i = 1 to NoOfRows('T1')
LET vAccID = peek('Account_ID', $(i)-1, 'T1');
//Get latst 4 values for YearPeriod
For x = 1 to 4
TMP_MAX:
Load max(YearPeriod,$(x)) as MAX Resident SOURCE_DATA WHERE Account_ID = $(vAccID) ;//AND RecNo()<=4;
next x
//concat last 4 values to use it in Match
TMP: Load concat(MAX, ',') as MAX4 Resident TMP_MAX;
LET vLast4Periods = peek('MAX4', -1, 'TMP');
drop table TMP_MAX;
//add a new field to stop concatenating this table to SOURCE_DATA and drop it later
NEW:
Load 1 as TEST, * Resident SOURCE_DATA WHERE Account_ID = $(vAccID) AND match(YearPeriod, $(vLast4Periods));
Next i
//cleanup
drop Field TEST;
drop tables SOURCE_DATA, T1, TMP;
May be this:
TMP:
Load * INLINE [
Account_ID, YearPeriod, Period, Batch_ID
300000001982, 201008, 8, 1
300000001982, 201009, 9, 2
300000001982, 201010, 10, 3
300000001982, 201011, 11, 4
300000001982, 201012, 12, 5
300000001982, 201101, 1, 6
300000001982, 201102, 2, 7
300000001982, 201103, 3, 8
300000001982, 201104, 4, 9
300000001982, 201105, 5, 10
300000001982, 201108, 8, 13
300000001982, 201109, 9, 14
300000001982, 201110, 10, 15
300000001999, 201008, 8, 1
300000001999, 201009, 9, 2
300000001999, 201010, 10, 3
300000001999, 201011, 11, 4
300000001999, 201012, 12, 5
300000001999, 201101, 1, 6
300000001999, 201104, 4, 9
300000001999, 201105, 5, 10
300000001999, 201106, 6, 11
300000001999, 201107, 7, 12
];
TMP2:
LOAD *
Where Key < 5;
LOAD *,
AutoNumber(YearPeriod, Account_ID) as Key
Resident TMP
Order By Account_ID, YearPeriod DESC;
DROP Table TMP;
Hello Sunny.
Thank you for the answer. Your solution worked perfectly, and effieciantly.
For 23386 Accounts with 859348 records the reload time is 2 minutes 27 seconds.
Thank you very much.
Hello Tim.
Thank you for your suggestion. The solution does work technically, but it took very longs to run.
On a smaller data set it would be perfect.
Thanks