Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I have a table of dates that looks like this:
Cust_No Order_Date
---------------------------------
1 01/01/2013
1 02/01/2013
1 05/01/2013
1 07/01/2013
1 09/01/2013
I want to loop through these dates so that only dates that are at least 3 months from the index date are loaded. In addition, once a date is found that is at least 3 months from the index date, I want that date to become the new index date. So the final table in the example above would look like below:
Cust_No Order_Date
---------------------------------
1 01/01/2013
1 05/01/2013
1 09/01/2013
In this case 01/01/2013 is the index date. 02/01/2013 is not at least 3 months after 01/01/2013, so it isn't loaded. 05/01/2013 is at least 3 months from 01/01/2013, so that is loaded and becomes the new index date. 07/01/2013 is not at least 3 months away from the new index date of 05/01/2013, so it is not loaded. 09/01/2013 is, so it is loaded.
I know that Peek() can be used to loop, but I don't know how to loop in such a way that the index date can be continuously reset. I would honestly be find with a flag that can identify the appropriate dates (which I can filter later), but I don't know how to create a loop that does this. Is this possible?
Still without resorting to loops by doing FOR or WHILE it is possible to handle multiple customer no's and unsorted input. Have a look at the following two approaches where the first presumes pre-sorted by customer no and date and the second one has to use a resident load to perform sort (less efficient):
Adjusted to handle multiple customer #:
Adjusted to handle multiple customer # and unsorted source data:
Hi
I am working with an example. But is the order date always the first of a month?
E.g. if you have an orderdate the 15.01.2013 will 02.04.2013 be a new index date or will the first order date that can be an index date be 16.04.2013?
I took the liberty to add some more rows of data. See if this script help:
Table:
LOAD *,
RowNo() as RowNum
Inline [
Cust_No, Order_Date
1, 01/01/2013
1, 02/01/2013
1, 05/01/2013
1, 07/01/2013
1, 09/01/2013
1, 10/01/2013
1, 02/01/2014
1, 09/01/2014
1, 10/01/2014
1, 11/01/2014
1, 02/01/2015
];
LET vRowCount = Peek('RowNum', -1);
NewTable:
NoConcatenate
LOAD Cust_No,
Order_Date
Resident Table
Where RowNum = 1;
FOR i = 1 to $(vRowCount)
LET vDate = Num(AddMonths(Peek('Order_Date', -1, NewTable), 3));
Concatenate (NewTable)
LOAD FirstSortedValue(Cust_No, RowNum) as Cust_No,
FirstSortedValue(Order_Date, RowNum) as Order_Date,
Min(RowNum) as RowNum
Resident Table
Where Num(Order_Date) > $(vDate);
NEXT
DROP Table Table;
Output:
Actually it doesn't require any looping as far as I can see. A couple of preceding loads is all you need and that makes it very efficient:
for more than 1 customer
SET DateFormat='MM/DD/YYYY';
source:
load rowno() as id, * inline [
Cust_No , Order_Date
1 , 01/01/2013
1 , 02/01/2013
1 , 05/01/2013
1 , 07/01/2013
1 , 09/01/2013
2 , 01/01/2014
2 , 02/01/2014
2 , 05/01/2014
2 , 07/01/2014
2 , 09/01/2014
3 , 03/01/2014
3 , 02/01/2014
3 , 05/01/2014
3 , 07/01/2014
3 , 09/01/2014
3 , 09/01/2014
3 , 10/01/2014
3 , 11/01/2014
3 , 11/01/2014
];
final:
load *, if(Order_Date > AddMonths(Index_Date,3) or Order_Date=Index_Date,1,0) as Flag;
load
if(Peek('Cust_No')=Cust_No,
if(AddMonths(Order_Date,-3) > Peek('Index_Date'), Order_Date, Peek('Index_Date')),
Order_Date) as Index_Date,
Cust_No , Order_Date, id
Resident source
Order by
//Cust_No, id,
Cust_No , Order_Date
;
DROP Table source;
Hi
I have created a small algorithmn to solve this issue.
The starting point was a new list of orders (for two customers) as I assume that you want an index date per customer.
Here is the list that I used:
[Customer_ID | Customer_Date |
1 | 01.01.2014 |
1 | 10.01.2014 |
1 | 25.01.2014 |
1 | 23.02.2014 |
1 | 04.03.2014 |
1 | 19.03.2014 |
1 | 15.04.2014 |
1 | 29.05.2014 |
1 | 08.08.2014 |
1 | 09.08.2014 |
1 | 15.08.2014 |
1 | 12.12.2014 |
1 | 10.01.2015 |
2 | 01.01.2014 |
2 | 01.05.2014 |
2 | 10.06.2014 |
2 | 12.12.2014]; |
The dates in bold is the dates I expect to find as index dates. And here is the result from the run:
Customer_ID | Index_Date |
1 | 01.01.2014 |
1 | 15.04.2014 |
1 | 08.08.2014 |
1 | 12.12.2014 |
2 | 01.01.2014 |
2 | 01.05.2014 |
2 | 12.12.2014 |
Here is the algorithmn:
LET v_NUM_ROWS = NoOfRows('TEST_DATES');
Do while v_NUM_ROWS>0
// Find new index dates (per customer) in the remaining orders
INDEX_DATES:
LOAD Customer_ID, Min(Customer_Date) as Index_Date
Resident TEST_DATES
Group by Customer_ID;
// Add the currently latest index dates per customer
Left Join (TEST_DATES)
LOAD Customer_ID, Max(Index_Date) as Max_Index_Date
Resident INDEX_DATES
Group by Customer_ID;
// Prepare to remove all orders that will not set a new index date
RENAME Table TEST_DATES to old_TEST_DATES;
// Remove all orders that will not set a new index date
TEST_DATES:
NoConcatenate
LOAD Customer_ID, Customer_Date
Resident old_TEST_DATES
Where Customer_Date > AddMonths(Max_Index_Date,3);
DROP Table old_TEST_DATES;
// Check if there are more orders left to set new index dates
LET v_NUM_ROWS = NoOfRows('TEST_DATES');
Loop
DROP Table TEST_DATES;
Still without resorting to loops by doing FOR or WHILE it is possible to handle multiple customer no's and unsorted input. Have a look at the following two approaches where the first presumes pre-sorted by customer no and date and the second one has to use a resident load to perform sort (less efficient):
Adjusted to handle multiple customer #:
Adjusted to handle multiple customer # and unsorted source data:
Hi Petter
Thanks for sharing your solution. Interresting to read about the use of preceeding loads.
It solves the issue I tried to solve. I can see the result is the same as for my looping solution.
I had to test it
This is exactly what I needed and very simple. Thank you!
This is an interesting approach that was great to see. Thanks!