Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Henric_Cronström
Not applicable

Set Analysis in the Aggr function

The Aggr() functions is one of the most advanced functions in the QIX engine, and it is not always easy to use. It does not get easier when you put set analysis expressions in it.

In one of my previous posts (Pitfalls of the Aggr function) I recommended having the set analysis expression in both the inner and the outer aggregation function when using set analysis in the Aggr() function. This is a good rule of thumb, because in most cases doing so will generate the result that you want.

But this is not always correct.

In more complex calculations you often need to use the condition in one place only – sometimes in the inner aggregation, sometimes in the outer. It depends on how the condition is formulated. Then it is important to understand the difference between the two positions.

The evaluation of the Aggr() function is a two-step process: In the first step, an intermediary virtual table is created using the inner aggregation and the dimension of the Aggr(). In the second step, this virtual table is aggregated using the outer aggregation.

For example, say that you want to find the largest order value per year. Then you would need to first calculate the sales value per order, and in a second step find the largest of these values. Hence

Formula1.png

The first step aggregates the source data (with multiple records per Order ID) into a virtual table with one record per Order ID, and the second step finds the largest values in the virtual table.

Two-steps.png

However, there is not yet any set analysis in the expression. So, let us use the following requirement instead:

  1. Show the largest order value per year
  2. Include only products from the product group “Shoes” in the order value
  3. Calculate these numbers only for 2014 and 2015

The two conditions correspond to the following set analysis expression:

Formula2.png

But where should this expression be written? In the outer or in the inner aggregation?

To answer this question, we must ask ourselves in which step the conditions should be used. Then it becomes obvious that the condition in product group must be used in step one – in the inner aggregation. If it is used in the outer aggregation only, the order values will be incorrect – they will be calculated from all products.

The condition in year, however, can be put in either place. Hence, the following expression will work fine:

Formula6.png

From the above example one might draw the conclusion that you always should put the condition in the inner aggregation. But this is not the case. Sometimes you have a condition that cannot be put in the inner aggregation. The following requirement can serve as example:

  1. Per year, show the bestselling product within the product group “Shoes”
  2. Show how this product ranks compared to all products, also non-shoes

The solution is the following table

Ranks.png

The condition in product group should be evaluated in step two, so the expressions used for Product and Rank are:

Formula5.png

Here it is not possible to have the condition on product group in the inner aggregation, since this would interfere with the calculation of the rank. You must have it in the outer aggregation.

Bottom lines are:

  • You need to figure out if your condition should be evaluated in step one (in the inner aggregation) or in step two (in the outer aggregation). This will determine where to put the set analysis expression. You may need to use The Feynman Problem Solving Algorithm.
  • If you can’t figure out where to put your set analysis expression – try putting it in both the outer and the inner aggregation function, and keep your fingers crossed. Afterwards you should however verify that the numbers are what you want.

HIC

Further reading related to this topic:

Pitfalls of the Aggr function

A Primer on Set Analysis

19 Comments
upaliwije
Not applicable

Great Post as always

0 Likes
2,550 Views
luciancotea
Not applicable

Hi hic,

Can you please include in the article also the Set Analysis of the aggr() function itself? It would be great.

Thanks,

Lucian

0 Likes
2,550 Views
Gysbert_Wassenaar
Not applicable

And where does the set expression for the aggr function itself fit in this?

Aggr({SetExpression}[DISTINCT] [NODISTINCT ] expr, StructuredParameter{,StructuredParameter})

0 Likes
2,550 Views
Henric_Cronström
Not applicable

Gysbert, Lucian

The Set Analysis of the Aggr() itself defines a scope for the aggregations inside the Aggr(). Or differently put: The Set Analysis expression of the Aggr() is injected into the relevant aggregation functions before they are calculated. This means that the two expressions below should return the same result.

   Aggr( {<SetExpression>} Sum(X)/Sum(Y), Dim)  

   Aggr( Sum({<SetExpression>} X)/Sum({<SetExpression>} Y), Dim)

Eventually there will come a blog post on this.

HIC

2,550 Views
swuehl
Not applicable

Henric, this (set analysis of the aggr() ) has been a recent change, right?

As far as I remember, it's been in the HELP for ages, but I don't remember it working until QV12.

[Just in case people wondering about the above syntax Gysbert showed:

The sortable Aggr function is finally here!

]

2,550 Views
Gysbert_Wassenaar
Not applicable

Does that mean that these two are equivalent too?

  • Aggr( {<SetExpressionA>} Sum({<SetExpressionB>} X)/Sum(Y), Dim) 
  • Aggr( Sum({<SetExpressionA>*<SetExpressionB>} X)/Sum({<SetExpressionA>} Y), Dim) 
2,550 Views
Henric_Cronström
Not applicable

Stefan

Yes, it is a recent change. Qlik Sense and and QlikView 12 only. I am frankly surprised that the older help files have a set analysis option in the Aggr() syntax...

HIC

0 Likes
2,550 Views
Henric_Cronström
Not applicable

Gysbert

Yes, the two expressions are equivalent.

In other places, there is a difference between {$<...>} and {<...>} where the first is a modification of the default selection, and the latter is a modification of the selection in that specific scope. Which makes a difference when you use alternate states. The idea is that you should have the same distinction inside an Aggr().

If you test it you will however find that it doesn't always work - the inner aggregation is always treated as a {<...>} even if you've written {$<...>}. So, we'll see if we can fix this.

HIC

2,550 Views
Gysbert_Wassenaar
Not applicable

Good, that probably means I understand it. I'm not sure that this 'shortcut' is worth the extra complexity though. Set analysis is hard enough to learn as it is. Bigger can, more worms

0 Likes
2,550 Views
luciancotea
Not applicable

Well, in fact it's older than that. I've been using it in v11.2 for a long time.

Edit: Just tested on 11.2 SR8 and it works.

0 Likes
2,550 Views
swuehl
Not applicable

That's interesting, Lucian. Just tried on QV11.20 SR11

=Aggr({1} If(Rank(Sum( AV_Menge))<=20,Only( Sammelartikel)) ,Sammelartikel)

did not work in a list box, I expected the same result like

=Aggr(If(Rank(Sum({1} AV_Menge))<=20,Only({1} Sammelartikel)) ,Sammelartikel)

I would be interested to see an example in QV11.20.

Could you upload a small sample QVW to a new thread?

0 Likes
2,550 Views
luciancotea
Not applicable

aggr set.png

This is in QlikView 11.2 SR 8. As you can see, the engine knows the syntax, it calculates it. But the set is applied to the outer result, at least that's how I interpreted the result.

0 Likes
2,550 Views
swuehl
Not applicable

I hope our little discussion is still at least partly on-topic.

Lucian, looking at your sample, it seems to me that the aggr() set analysis isn't considered at all: Isn't the result same as in th 4th column, the advanced aggregation without any set analysis applied?

That's what I've observed in pre QV12 versions, too, the expressions gets evaluated (no calculation error, just a syntax checker error), but the set analysis will just be ignored.

0 Likes
2,550 Views
luciancotea
Not applicable

Yeah, I didn't give it too much thought... I just applied HIC's rule of thumb: write it everywhere...

0 Likes
2,550 Views
jolivares
Not applicable

HI Henric, I was looking out this post and others about Aggr functions, in order to try to solve a problem with table.  Reviewing other post I'm sure that I making a "pitfalls" using that function.

The thing is: Top 5 Customers per ProductType for a given month.  The next expression give me the Rank form this customer

Aggr(Rank(Sum({<FiscalMonth={"$(=MaxString(FiscalMonth))"}>} Amount)),ProductType,Customer)<=5

But if I try to complicate using this result as an input for a set analysis I don't receive the expected result:

Sum({<Customer={"=Aggr(Rank(Sum({<FiscalMonth={'$(=MaxString(FiscalMonth))'}>} Amount)),ProductType,Customer)<=5"}>} Amount)

May you check what is my "pitfall".  Thanks

0 Likes
2,550 Views
swuehl
Not applicable

The pitfall is that the advanced search expression should give a single answer for each Customer, TRUE or FALSE.

Basically you are building a virtual table with Customer as dimension and a single expression

=Aggr(Rank(Sum({<FiscalMonth={'$(=MaxString(FiscalMonth))'}>} Amount)),ProductType,Customer)<=5

And for all kind of expressions, there is a general rule:

Use Aggregation Functions!

The Aggr() is missing the outer aggregation function, so Only() will be applied, and this is not what you want here, right?

0 Likes
2,550 Views
jolivares
Not applicable

Thanks swuehl for answer, really that I don't get the solution.  I open a case in order to get some post from the community.

May you check this out? I appreciate your help.

Aggr With Ranking

0 Likes
2,550 Views
yurgelmartina
Not applicable

Hello Henric,

Could you please help me understand this problem:

I have created a formula:

=IF(Sum(CIANO_SITE_LOJAS_UNIDADES)/Sum(RUPTURA_SITE_LOJA)>1,1,Sum(CIANO_SITE_LOJAS_UNIDADES)/Sum(RUPTURA_SITE_LOJA))

The values in SKU level look ok, but when I see the total I needed the average of 76%,100%,100%,100% (...) --> sould be 95% not 100%.

The issue is that the total does the formula itself.

Is there any way to solve this issue? Maybe use AGGR?

tries something like that but did not work at all.

=IF(aggr(SUM(CIANO_SITE_LOJAS_UNIDADES),SKU)/aggr(SUM(RUPTURA_SITE_LOJA),SKU)>1,1,aggr(SUM(CIANO_SITE_LOJAS_UNIDADES),SKU)/aggr(SUM(RUPTURA_SITE_LOJA),SKU))

Apreciate your help.

sku.JPG

0 Likes
2,550 Views
Henric_Cronström
Not applicable

I think you should try the following two alternatives:

1) An unweighted average of the different SKU:s:

   Avg(Aggr(RangeMin(Sum(CIANO_SITE_LOJAS_UNIDADES)/Sum(RUPTURA_SITE_LOJA),1),SKU))

Here, the RangeMin() assures that the ratio never is greater than 1.

2) A weighted average of the different SKU:s:

   Sum(Aggr(RangeMin(Sum(CIANO_SITE_LOJAS_UNIDADES),Sum(RUPTURA_SITE_LOJA)),SKU))

   /

   Sum(Aggr(Sum(RUPTURA_SITE_LOJA),SKU))

Here the RangeMin() in the numerator assures that the Sum(CIANO_SITE_LOJAS_UNIDADES) never is greater than Sum(RUPTURA_SITE_LOJA).

In both cases, I have used SKU as inner dimension in the Aggr(), but it is likely that you need to replace this by all dimensions of the chart, i.e.

   Aggr(..., SKU, HBT, BUFFER_1_LOJA, CODCD)

HIC

0 Likes
2,550 Views