Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
hi all,
I need some help for an expression involving nested aggregation.
I have some sale data with sales for different countries. Fields are COUNTRY and SALE
I am building a straight table graph with average sales per country. No Pb. Dimension is COUNTRY and expressions are :
* AVG_SALE = avg(sale)
* STDEV_SALE = stdev(sale)
output then looks like
country avg_sale stdev_sale
FR 12 3
UK 15 4
...
Now I need to add a column CAP_AVG_SALE that would display the average of sales capped to (avg_sale+stdev_sale) per country, i.e if sale value exceeds the average+standard deviation for the country, we retain (avg_sale+stdev_sale).
The output should then look like
country avg_sale stdev_sale cap_avg_sale
FR 12 3 11.7
UK 15 4 14.6
...
My first try was to use an expression like
avg( RangeMin (SALE, avg(TOTAL SALE) + Stdev(TOTAL SALE)))
The problem with this is that the TOTAL is applied on the COUNTRY dimension so the average avg(TOTAL SALE) is the global average, but what I need in the nested avg() is the average per country
So I would need something more like
avg( RangeMin (SALE, avg({<COUNTRY={current country in the table row}>} TOTAL SALE) + Stdev({<COUNTRY={current country in the table row}>} TOTAL SALE)))
right now I cannot fin a formula to get the country from the current table row so I'm a bit stuck
so my question is :
1) Is there maybe a staistical function that I could use and does exactly what I need so I don't need to do these nested aggregations ?
2) if not, which expression could you advise ?
3) In my actual graph, COUNTRY is actually a dimension group. Hopefully the solution should also work with that
thanks a lot
Philippe
For nested aggregations you need to use the aggr function:
avg(aggr( RangeMin (SALE, avg(SALE) + Stdev(SALE))),country))
But since country is a group and not a field you need to use the getcurrentfield function:
avg(aggr( RangeMin (SALE, avg(SALE) + Stdev(SALE))), $(=getcurrentfield[country])) ))
Hi Gysbert,
thanks for the prompt answer, unfortunately it doesn't behave as expected : the expression output is the addition of avg_sale+stdev_sale. I think it's because the SALE that is compared to avg(SALE) in the RangeMin() is inside the aggr() ?
So it seems we would like to use the aggr() only for the inner avg() and stdev() like in
avg(RangeMin (sale, aggr(avg(sale) + Stdev(sale), country)))
which doesn't work because I think the aggr is an internal table of several lines when we need one line for the current country, hence my first idea of a set analysis ....
I enclose a basic test projet
best regards
Phlippe
Just to check, can you take a look at the second (lower) straight table in the attached qvw. Are the cap_stdev_sale values in that chart correct?
here is the file with the correct values. I have modified slightly the data samples as there were duplicates, and have modified your join which was a full cartesian
thanks
Ok, in case you want to do this without precalculating averages and stdevs in the script you can try this expression: avg(aggr(rangemin(only(sale),avg(total <country> sale) + stdev(total <country> sale)), country,sale))
that's correct !!
if you have time, could you give a quick explanation for the formula, it's still not cristal clear to me
also, how to use the dimension group instead of <country>
thanks !
The rangemin function should compare the sales values at the lowest aggregation level, i.e. country,sale. That's why the the outer avg is aggregating over the dimensions sale and country: avg(aggr(something),country,sale)). The rangemin should compare the lowest level sale with the avg and stdev of sale at the country level. That's why those get the total <country> modifier.
actually after adding a new dimension "product" and doing more tests, I had some discrepencies and I finally found out the the aggr () had to be nested inside the rangemin() :
avg(rangemin(sale, aggr(avg(total <country> sale)
+ stdev(total <country> sale), country, sale)))
since my dimension is in a group, the final formula is :
avg(rangemin(sale, aggr(avg(total <$(=getcurrentfield("group1"))> sale)
+ stdev(total <$(=getcurrentfield("group1"))> sale), $(=getcurrentfield("group1")), sale)))