Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I´m doing some investigation on our Purchase-transactiondata and need to calculate a 3 month Rolling average per ProductID to identify "bad data" / outliers.
The reason for trying to do it in the load script, is 1) that I have millions of records and 2) hoping to do some automation in the process of handling the "bad data" / outliers.
I know that I can do something similar in Chart expressions, but finding it very hard to identify and handle "bad data" / outliers using Charts = a lot of manual work
Any ideas out there?
Hello,
I would suggest following structur:
load
sum(value) as Sumvalue
year/period
ProductID
resident your data group by year/period, ProductID where not bad data
then
give your year/period a number, oder by year / period
eg 201101=1, 201102 = 2, 201111=11, 201112=12, 201201=13, 201202=14 and so on.
Then calculate in a loop
sum(Sumvalue)
year/period
ProductID
resident Sumtable group by year/period, ProductID where periodnumber <= loop -3
Regards
hi check the attach example
Hi both,
Interesting - right now I´m having a look at your suggestions and will get back later, hopefully today.
Thanks for sharing 🙂
Hi Martin,
You pointed me in the right direction - ended up with a 2-step solution using WHILE and IterNo() for ordinary load and afterwards aggregation
// Loading each purchase-line up to 3 times (WHILE-syntax) and "moving" DeliveryMonth (DeliveryMonth_new) accordingly
PurchaseLines_3mth:
LOAD ProductID,Amount,Cost
,AddMonths(DeliveryMonth,IterNo()-1) AS DeliveryMonth_new
Resident PurchaseLines
while AddMonths(AddMonths(DeliveryMonth,-2),IterNo()-1) <= DeliveryMonth
// Aggregate by ProductID,DeliveryMonth_new
PurchaseLines_aggr:
NoConcatenate
LOAD
ProductID,DeliveryMonth_new
,Sum(Amount) AS AmtTotal3mth,Sum(Cost) AS CostTotal3mth,Count(ProductID) AS LinesTotal3mth
Resident PurchaseLines_3mth
Group BY ProductID,DeliveryMonth_new;
Drop table PurchaseLines_3mth;
The aggregated data is joined back on transactiondata and used for further calculations
Hi liron,
Thanks for mentioning the RangeAvg-function - I can use that in further calculations in my project