Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Guys,
I am working on company's financial data. Here's my question:
Can you please give an example of a function where I want to populate a null value by calculating average of other values for a specific range in a date dimension.
Ex:
We have 3 fields : Account No, Date and Cost. Now I see some null values in cost variable. I want to replace it with average of other cost values and use grouping by Date and Account for that null value.
Can anyone help on this?
i will use a simple data set to illustrate the logic. 3 accounts, cost is monthly - 1 row per account per month - thats the granularity.
i would use table operations for this. first establish the business rule. for example, get the null cost from prior n months. so for one account that has a NULL cost for 9/1/2022, ill total all the cost from 6/1/2022 to 8/1/2022:
data:
load [Account No], Date, if(Cost>=100, Cost) as Cost;
load [Account No], date(addmonths(monthstart(today()),-IterNo()+1)) as Date, floor(rand()*1000) as Cost
while IterNo()<24;
load * inline [
Account No
A001
A002
A003
];
//this gets all rows with null cost
NoConcatenate tmpData:
load * Resident data
where isnull(Cost);
//inner join to get all other rows to aggregate
Inner join (tmpData)
load [Account No], Date as tmpDate, Cost as tmpCost
Resident data
where not isnull(Cost);
//filter out rows that dont satisfy business rules
NoConcatenate
newTmpData:
load * Resident tmpData
where Date>=tmpDate and tmpDate>=AddMonths(Date,-3);
//the sample business rule is SUM all costs 3 months back
//i used sum so it is easy to test but you can use any rule you want likely AVG
//aggregate the costs
NoConcatenate
aggrData:
load [Account No], Date, sum(tmpCost) as sumCost
Resident newTmpData group by [Account No], Date;
//add it back, i separated the field so you can see it
left join (data) load * Resident aggrData;
drop table tmpData, newTmpData, aggrData;
so sample null data point is 4/1/2022 - sums 3/1/2022 and 1/1/2022: