Re: Calculating average of top N % of distribution - Qlik Community

Report Inappropriate Content · ‎2016-06-12

Hi all,

I am trying to calculate the average of a subset of a distribution. In the picture below I show two examples.

In the left chart, the average of the full distribution is calculated, with the following formula:

sum(Measure*Dimension)/sum(Measure)

However, what I'm trying to accomplish is calculating the average for the top 70% mass of the distribution. I am fairly certain that this can be achieved with a combination of the aggr, rank, rangesum, and similar functions, but I haven't gotten it to work.

Does anyone have any ideas?

Any help is greatly appreciated!

Best regards,

Niclas

Oleg_Troyansky · ‎2016-06-12

These are the kind of solutions that I teach at the Masters Summit for Qlik, within my lecture about Set Analysis and AGGR(). If you posted a more specific example, I could give you a more specific solution. In the absence of a specific formula, here is a generic one:

avg = avg( AGGR( sum( {<Dimension= {"=(calculation of the bucket)>=7 "} >} Measure) , Dimension))

In a nutshell, you need to average out the results only for those Dimension values that were associated with the top three buckets.

Notice that by Dimension here I mean the actual dimension that's participating in the histogram, not the buckets 1-9. Again, it would be easier to explain on a more specific example.

cheers,

Oleg Troyansky

Upgrade your Qlik skills at the Masters Summit for Qlik - coming soon to Austin, TX and Johannesburg, South Africa!

If you can't make it to the Masters Summit, check out my book QlikView Your Business.

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

Report Inappropriate Content · ‎2016-06-13

Dear Oleg,

Thanks for your reply. I will try to flesh out my problem a little, as well as provide a small sample application.

In my example, the total sum of my measure totals to 155. The measure is distributed over a dimension with values 1 to 9. The total distribution average, given the distribution, in this case comes out to 5.4 (see application). I want to know the average of the top 70% of my measure. In this case, I want to know the average of the 155*0.7=108.5 measure values which occur in the upper end of my distribution.

Thanks so much in advance!

Best regards,

Niclas