Re: Aggr() aggravation - Qlik Community

Report Inappropriate Content · ‎2015-11-04

I'm trying to build a measure to show the expected number of diseases in a local population with known age distribution (and where the disease rate varies by age). To do this, I multiply the population at each age group by the rates of disease at the national level, i.e.

Sum( Aggr( Sum( Population ) * Sum( DiseaseCases_National ) / Sum( Population_National ), AgeGroup )

To make this work on a chart with drilldowns for both location and disease type dimensions, I've had to add both the Location grain and the DiseaseType grain to the AgeGroup (see footnote for details).

This not only slowed things down significantly, but it's also technically wrong (or at least not to spec), since it means that location and disease type are being used as weighting variables - as opposed to being exploratory dimensions, i.e. the measure (expected cases) should be sliceable and collapsible by location and disease type.

Any workaround (or optimization) tips on this would be much appreciated.

Dannie

P.S. HIC provides an excellent discussion of this Aggr() function pitfall here:

https://community.qlik.com/blogs/qlikviewdesignblog/2015/10/05/pitfalls-of-the-aggr-function

Somehow I find it odd that Aggr() ignores row/col selections (or "outer aggregations", in Henric's terms). I would've thought this is the role of the "Total" qualifier, i.e. to sum across a dimension ignoring its filters or selections on the chart.

jonathandienst · ‎2015-11-05

I assume you mean changing the expression to

=Sum(Aggr(Sum(Population) * Sum(DiseaseCases_National) / Sum(Population_National), AgeGroup, Location, DiseaseType))

Perhaps your need this:

=Sum(Aggr(Sum(Population) * Sum(DiseaseCases_National) / Sum(TOTAL <Location, DiseaseType> Population_National), AgeGroup, Location, DiseaseType))

This will not necessarily improve performance, but it should ensure that the more granular values will sum up to the original values.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein

Report Inappropriate Content · ‎2015-11-05

Not quite. I think you meant this:

= Sum(Aggr(Sum(Population) * Sum(TOTAL <Location, DiseaseType> DiseaseCases_National) / Sum(TOTAL <Location, DiseaseType> Population_National), AgeGroup, Location, DiseaseType))

... which yields the same results as this:

= Sum(Population) * Sum(DiseaseCases_National) / Sum(Population_National)

where the rate is not stratified by age. Below table illustrates the formula:

National National Local Local

Age Disease Population Population Disease Expected

------------------------------------------------------------------------------------------------------------------------------------

40 5 100 10 = (5/100)*10 0.5

50 10 200 20 = (10/200)*20 1.0

60 20 100 50 = (20/100)*50 10.0

------------------------------------------------------------------------------------------------------------------------------------

Expected Cases = 11.5

Again, ideally, this should work regardless of other dimensions on the chart, with only Age as the aggregation (or stratification) variable, i.e. = Sum(Aggr(Sum(Population) * Sum(DiseaseCases_National) / Sum(Population_National), AgeGroup))

But when I add location (or another non-age dimension) to the chart, the measure is calculated for only one subgroup, e.g.

Location Expected Cases

A 11.5

B -

C -

I suppose the behavior I'm looking for is similar to the following DAX formula in Excel PowerPivot:

= SUMX( SUMMARIZE( AgeDimension, AgeGroupKey, [Rate National] ), [Rate National]*[Local Population] )

which works within any dimension context on the pivot table or chart.