Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
tduarte
Partner - Creator II
Partner - Creator II

Cumulative 12 months Sum on load script

Hi all,

I need to create a cumulative 12 months Sum(Revenue) on the load script, aggregated by month, category and reseller.

Meaning that for each month, the sum should be based on up to the 12 previous months.

MonthRevenueCumulative12
Jan-121010
Feb-121020
Mar-121030
Apr-121040
May-121050
Jun-121060
Jul-121070
Aug-121080
Sep-121090
Oct-1210100
Nov-1210110
Dec-1210120
Jan-1310120
Feb-1310120
Mar-1310120
Apr-1310120
May-1310120
Jun-1310120
Jul-1310120
Aug-1310120
Sep-1310120
Oct-1310120

The table contains many fields but I want the cumulative12 Sum, that will be used for banding, to be aggregated by month, category and  reseller.

I had a look at rangesum and peek not sure it will work here. I think it can be achieve with joins and group by.

Thanks.

10 Replies
Gysbert_Wassenaar

Rangesum and peek will work, but it's going to be quite an expression:

rangesum(Revenue, peek(Revenue), peek(Revenue, -2), peek(Revenue,-3), ...etc... , peek(Revenue,-11)) as Cumulative12

Perhaps an AsOf table is a better approach in this case. See this document: Calculating rolling n-period totals, averages or other aggregations


talk is cheap, supply exceeds demand
Not applicable

Hi Telmo

A simple solution would be to join on the revenue data for the single month 12 months previously, then cumulatively subtract at the sane time as the cumulative sum. This example will run by itself in a script:

//Generated Sales Data

GenData:

load rowno() as Month, round(rand()*100) as Sales autogenerate(36);

//Add in sales from the same table, as they were 12 months ago

left join (GenData) Load  Month + 12 as Month , Month  as MonthL12, Sales as SalesL12 resident GenData;

//Do the usual cumulative sum - but minus the value from 12 months ago

left join (GenData) Load Month,

numsum(Sales,Peek(CumSum) ) as CumSum,

numsum(Sales,Peek(CumSumL12) ,-SalesL12 ) as CumSumL12

  resident GenData;

Regards,

Erica

tduarte
Partner - Creator II
Partner - Creator II
Author

Hi G Wassenaar,

An AsOf table was my initial design but due to poor performance in the pivot table I need to try and create the cumulative revenues and the corresponding bands in the script.

The Rangesum didn't work for me because I need to group it by several dimensions.

tduarte
Partner - Creator II
Partner - Creator II
Author

Hi Erica,

I had a go using your suggestion and it's more or less what I was looking for but unfortunately the cumulative sums are incorrect because these are not aggregating by my other 2 fields, reseller and category.

This is what I've done:

Banding:

Load

  MonthStartDate as Banding.Date,

  Category as Banding.Category,

  ResellerID as Banding.ResellerID,

  Sum(Revenue) as Banding.Revenue

Resident Data

Group by MonthStartDate, Category, ResellerID;

//Add in sales from the same table, as they were 12 months ago

Left Join (Banding)

Load

  AddMonths(Banding.Date,12) as Banding.Date,

  Banding.Category,

  Banding.ResellerID,

  Banding.Date as Banding.DateL12,

  Banding.Revenue as Banding.RevenueL12

Resident Banding;

//Do the usual cumulative sum - but minus the value from 12 months ago

Left Join (Banding)

Load

  Banding.Date,

  Banding.Category,

  Banding.ResellerID,

  numsum(Banding.Revenue,Peek(CumSum)) as CumSum,

  numsum(Banding.Revenue,Peek(CumSumL12) ,-Banding.RevenueL12) as CumSumL12

Resident Banding;

Regards,

Telmo

Not applicable

Hi Telmo

You need to check that the Category and reseller in the row above are the same as the current row, before applying the cumulative sum. In this simplified example:

//Do the usual cumulative sum - but minus the value from 12 months ago

left join (GenData) Load Cat1, Cat2, Month,

if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2,    numsum(Sales,Peek(CumSum) ),0)                     as CumSum,

if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2,    numsum(Sales,Peek(CumSumL12) ,-SalesL12 ),0)     as CumSumL12

  resident GenData;

I've used the peek function again to verify that we are still in the same category as the row above, before adding the sales to the cumulative sum function.

Regards,

Erica

Not applicable

PS this full script will work on its own and serve as a test:

//Generate Categories:

GenData:

LOAD * INLINE [

    Cat1

    Birmingham

    London

    Newcastle

    Manchester

    Liverpool

    Bristol

    Bath

    Bournemouth

];

Left join (GenData) LOAD * INLINE [

    Cat2

    Bananas

    Olives

    Apples

    Celery

    Lettuce

];

   

//Add in the months

Left join (GenData)load rowno() as Month autogenerate(36);

//Add in random sales data

Left join (GenData) Load Cat1, Cat2, Month, round(rand()*100) as Sales resident GenData;

//Add in the previous 12 months total8/

left join (GenData) Load Cat1, Cat2,  Month + 12 as Month , Month  as MonthL12, Sales as SalesL12 resident GenData;

//Do the usual cumulative sum - but minus the value from 12 months ago

left join (GenData) Load Cat1, Cat2, Month,

if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2,    numsum(Sales,Peek(CumSum) ),0)                     as CumSum,

if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2,    numsum(Sales,Peek(CumSumL12) ,-SalesL12 ),0)     as CumSumL12

  resident GenData;

tduarte
Partner - Creator II
Partner - Creator II
Author

Thanks Erica,

So this mean that the only way of making it work is to do an Order By ResellerID, Category and Month - Month needs to be the last -  before doing the Peek / cumulative sum.

I guess I was hoping there would be a different way.

tduarte
Partner - Creator II
Partner - Creator II
Author

The cumulative sum expression should be:

if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2,   numsum(Sales,Peek(CumSum) ),Sales) as CumSum,

if(Peek(Cat1)&peek(Cat2) = Cat1&Cat2,   numsum(Sales,Peek(CumSumL12) ,-SalesL12 ),Sales) as CumSumL12

Otherwise the first row isn't taken into the sum.

Now I just realized this still doesn't work because my data doesn't have all the Months for each Category/ResellerID combination. This means that not all Month have a Month + 12 and SalesL12 won't be added in the join. As a result the CumSumL12 ends up being higher than what it should.

The sum should be based on up to the 12 previous months but not necessarily 12 records.

Capture.PNG.png

For example, in 2012-10-01, the Cumulative 12 Months sum should be 3415.26.

Any other ideas?

Not applicable

Hi Telmo, that's a tricky one!

You have 3 options:

1) Complete the months in the table using dummy month numbers (easiest way to do this would be an outer join on a straight list of months)

2) Completely different method- use "intervalmatch" to join the table onto itself using an interval of Month, and month -12. Then group by month and sum by all the sales of the joined table (not recommended as you will end up with a datset 12 X bigger)

or 3) join in any month -12 's that dont exist, and vice versa. Infact all you have to do to achieve this is change the bit where you join the Month-12 in to an outer join- which will fill in the blanks for you:

//Add in the previous 12 months total8/

outer join (GenData) Load Cat1, Cat2,  Month + 12 as Month , Month  as MonthL12, Sales as SalesL12 resident GenData;

Erica