Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
I need to create a cumulative 12 months Sum(Revenue) on the load script, aggregated by month, category and reseller.
Meaning that for each month, the sum should be based on up to the 12 previous months.
Month | Revenue | Cumulative12 |
Jan-12 | 10 | 10 |
Feb-12 | 10 | 20 |
Mar-12 | 10 | 30 |
Apr-12 | 10 | 40 |
May-12 | 10 | 50 |
Jun-12 | 10 | 60 |
Jul-12 | 10 | 70 |
Aug-12 | 10 | 80 |
Sep-12 | 10 | 90 |
Oct-12 | 10 | 100 |
Nov-12 | 10 | 110 |
Dec-12 | 10 | 120 |
Jan-13 | 10 | 120 |
Feb-13 | 10 | 120 |
Mar-13 | 10 | 120 |
Apr-13 | 10 | 120 |
May-13 | 10 | 120 |
Jun-13 | 10 | 120 |
Jul-13 | 10 | 120 |
Aug-13 | 10 | 120 |
Sep-13 | 10 | 120 |
Oct-13 | 10 | 120 |
The table contains many fields but I want the cumulative12 Sum, that will be used for banding, to be aggregated by month, category and reseller.
I had a look at rangesum and peek not sure it will work here. I think it can be achieve with joins and group by.
Thanks.
Rangesum and peek will work, but it's going to be quite an expression:
rangesum(Revenue, peek(Revenue), peek(Revenue, -2), peek(Revenue,-3), ...etc... , peek(Revenue,-11)) as Cumulative12
Perhaps an AsOf table is a better approach in this case. See this document: Calculating rolling n-period totals, averages or other aggregations
Hi Telmo
A simple solution would be to join on the revenue data for the single month 12 months previously, then cumulatively subtract at the sane time as the cumulative sum. This example will run by itself in a script:
//Generated Sales Data
GenData:
load rowno() as Month, round(rand()*100) as Sales autogenerate(36);
//Add in sales from the same table, as they were 12 months ago
left join (GenData) Load Month + 12 as Month , Month as MonthL12, Sales as SalesL12 resident GenData;
//Do the usual cumulative sum - but minus the value from 12 months ago
left join (GenData) Load Month,
numsum(Sales,Peek(CumSum) ) as CumSum,
numsum(Sales,Peek(CumSumL12) ,-SalesL12 ) as CumSumL12
resident GenData;
Regards,
Erica
Hi G Wassenaar,
An AsOf table was my initial design but due to poor performance in the pivot table I need to try and create the cumulative revenues and the corresponding bands in the script.
The Rangesum didn't work for me because I need to group it by several dimensions.
Hi Erica,
I had a go using your suggestion and it's more or less what I was looking for but unfortunately the cumulative sums are incorrect because these are not aggregating by my other 2 fields, reseller and category.
This is what I've done:
Banding:
Load
MonthStartDate as Banding.Date,
Category as Banding.Category,
ResellerID as Banding.ResellerID,
Sum(Revenue) as Banding.Revenue
Resident Data
Group by MonthStartDate, Category, ResellerID;
//Add in sales from the same table, as they were 12 months ago
Left Join (Banding)
Load
AddMonths(Banding.Date,12) as Banding.Date,
Banding.Category,
Banding.ResellerID,
Banding.Date as Banding.DateL12,
Banding.Revenue as Banding.RevenueL12
Resident Banding;
//Do the usual cumulative sum - but minus the value from 12 months ago
Left Join (Banding)
Load
Banding.Date,
Banding.Category,
Banding.ResellerID,
numsum(Banding.Revenue,Peek(CumSum)) as CumSum,
numsum(Banding.Revenue,Peek(CumSumL12) ,-Banding.RevenueL12) as CumSumL12
Resident Banding;
Regards,
Telmo
Hi Telmo
You need to check that the Category and reseller in the row above are the same as the current row, before applying the cumulative sum. In this simplified example:
//Do the usual cumulative sum - but minus the value from 12 months ago
left join (GenData) Load Cat1, Cat2, Month,
if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2, numsum(Sales,Peek(CumSum) ),0) as CumSum,
if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2, numsum(Sales,Peek(CumSumL12) ,-SalesL12 ),0) as CumSumL12
resident GenData;
I've used the peek function again to verify that we are still in the same category as the row above, before adding the sales to the cumulative sum function.
Regards,
Erica
PS this full script will work on its own and serve as a test:
//Generate Categories:
GenData:
LOAD * INLINE [
Cat1
Birmingham
London
Newcastle
Manchester
Liverpool
Bristol
Bath
Bournemouth
];
Left join (GenData) LOAD * INLINE [
Cat2
Bananas
Olives
Apples
Celery
Lettuce
];
//Add in the months
Left join (GenData)load rowno() as Month autogenerate(36);
//Add in random sales data
Left join (GenData) Load Cat1, Cat2, Month, round(rand()*100) as Sales resident GenData;
//Add in the previous 12 months total8/
left join (GenData) Load Cat1, Cat2, Month + 12 as Month , Month as MonthL12, Sales as SalesL12 resident GenData;
//Do the usual cumulative sum - but minus the value from 12 months ago
left join (GenData) Load Cat1, Cat2, Month,
if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2, numsum(Sales,Peek(CumSum) ),0) as CumSum,
if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2, numsum(Sales,Peek(CumSumL12) ,-SalesL12 ),0) as CumSumL12
resident GenData;
Thanks Erica,
So this mean that the only way of making it work is to do an Order By ResellerID, Category and Month - Month needs to be the last - before doing the Peek / cumulative sum.
I guess I was hoping there would be a different way.
The cumulative sum expression should be:
if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2, numsum(Sales,Peek(CumSum) ),Sales) as CumSum,
if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2, numsum(Sales,Peek(CumSumL12) ,-SalesL12 ),Sales) as CumSumL12
Otherwise the first row isn't taken into the sum.
Now I just realized this still doesn't work because my data doesn't have all the Months for each Category/ResellerID combination. This means that not all Month have a Month + 12 and SalesL12 won't be added in the join. As a result the CumSumL12 ends up being higher than what it should.
The sum should be based on up to the 12 previous months but not necessarily 12 records.
For example, in 2012-10-01, the Cumulative 12 Months sum should be 3415.26.
Any other ideas?
Hi Telmo, that's a tricky one!
You have 3 options:
1) Complete the months in the table using dummy month numbers (easiest way to do this would be an outer join on a straight list of months)
2) Completely different method- use "intervalmatch" to join the table onto itself using an interval of Month, and month -12. Then group by month and sum by all the sales of the joined table (not recommended as you will end up with a datset 12 X bigger)
or 3) join in any month -12 's that dont exist, and vice versa. Infact all you have to do to achieve this is change the bit where you join the Month-12 in to an outer join- which will fill in the blanks for you:
//Add in the previous 12 months total8/
outer join (GenData) Load Cat1, Cat2, Month + 12 as Month , Month as MonthL12, Sales as SalesL12 resident GenData;
Erica