Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

need help on a nested aggregation

hi all,

I need some help for an expression involving nested aggregation.

I have some sale data with sales for different countries. Fields are COUNTRY and SALE

I am building a straight table graph  with average sales per country. No Pb. Dimension is COUNTRY and expressions are :

* AVG_SALE = avg(sale)

* STDEV_SALE = stdev(sale)

output then looks like

country    avg_sale       stdev_sale

FR         12             3

UK         15             4

...

Now I need to add a column CAP_AVG_SALE that would display the average of sales capped to (avg_sale+stdev_sale) per country, i.e if sale value exceeds the average+standard deviation for the country, we retain (avg_sale+stdev_sale).

The output should then look like

country    avg_sale       stdev_sale   cap_avg_sale

FR         12             3            11.7

UK         15             4            14.6

...

My first try was to use an expression like

avg( RangeMin (SALE, avg(TOTAL SALE) + Stdev(TOTAL SALE)))

The problem with this is that the TOTAL is applied on the COUNTRY dimension so the average avg(TOTAL SALE) is the global average, but what I need in the nested avg() is the average per country

So I would need something more like

avg( RangeMin (SALE, avg({<COUNTRY={current country in the table row}>} TOTAL SALE) + Stdev({<COUNTRY={current country in the table row}>} TOTAL SALE)))

right now I cannot fin a formula to get the country from the current table row so I'm a bit stuck

so my question is :

1) Is there maybe a staistical function that I could use and does exactly what I need so I don't need to do these nested aggregations ?

2) if not, which expression could you advise ?

3) In my actual graph, COUNTRY is actually a dimension group. Hopefully the solution should also work with that

thanks a lot

Philippe

8 Replies
Gysbert_Wassenaar
Partner - Champion III
Partner - Champion III

For nested aggregations you need to use the aggr function:

avg(aggr( RangeMin (SALE, avg(SALE) + Stdev(SALE))),country))


But since country is a group and not a field you need to use the getcurrentfield function:


avg(aggr( RangeMin (SALE, avg(SALE) + Stdev(SALE))), $(=getcurrentfield[country])) ))


talk is cheap, supply exceeds demand
Not applicable
Author

Hi Gysbert,

thanks for the prompt answer, unfortunately it doesn't behave as expected : the expression output is the addition of avg_sale+stdev_sale. I think it's because the SALE that is compared to avg(SALE) in the RangeMin() is inside the aggr() ?

So it seems we would like to use the aggr() only for the inner avg() and stdev() like in

avg(RangeMin (sale, aggr(avg(sale) + Stdev(sale), country)))

which doesn't work because I think the aggr is an internal table of several lines when we need one line for the current country, hence my first idea of a set analysis ....

I enclose a basic test projet

best regards

Phlippe

Gysbert_Wassenaar
Partner - Champion III
Partner - Champion III

Just to check, can you take a look at the second (lower) straight table in the attached qvw. Are the cap_stdev_sale values in that chart correct?


talk is cheap, supply exceeds demand
Not applicable
Author

here is the file with the correct values. I have modified slightly the data samples as there were duplicates, and have modified your join which was a full cartesian

thanks

Gysbert_Wassenaar
Partner - Champion III
Partner - Champion III

Ok, in case you want to do this without precalculating averages and stdevs in the script you can try this expression: avg(aggr(rangemin(only(sale),avg(total <country> sale) + stdev(total <country> sale)), country,sale))


talk is cheap, supply exceeds demand
Not applicable
Author

that's correct !!

if you have time, could you give a quick explanation for the formula, it's still not cristal clear to me

also, how to use the dimension group instead of <country>

thanks !

Gysbert_Wassenaar
Partner - Champion III
Partner - Champion III

The rangemin function should compare the sales values at the lowest aggregation level, i.e. country,sale. That's why the the outer avg is aggregating over the dimensions sale and country: avg(aggr(something),country,sale)). The rangemin should compare the lowest level sale with the avg and stdev of sale at the country level. That's why those get the total <country> modifier.


talk is cheap, supply exceeds demand
Not applicable
Author

actually after adding a new dimension "product" and doing more tests, I had some discrepencies and I finally found out the the aggr () had to be nested inside the rangemin() :

avg(rangemin(sale, aggr(avg(total <country> sale)

   + stdev(total <country> sale), country, sale)))

since my dimension is in a group, the final formula is :

avg(rangemin(sale, aggr(avg(total <$(=getcurrentfield("group1"))> sale)

   + stdev(total <$(=getcurrentfield("group1"))> sale), $(=getcurrentfield("group1")), sale)))