Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Often you need to create conditional aggregations in QlikView, e.g. when you want to create a graph that shows this year’s numbers only, also if there are several years possible.

 

Image1 count.png

 

There are basically three ways to do this

  • A conditional expression outside the aggregation function, e.g. If(<Condition>, Sum(<Expression>))
  • A conditional expression inside the aggregation function, e.g. Sum(If(<Condition>,<Expression>))
  • Set Analysis, e.g. Sum( {<Set Expression>} <Expression> )

If you choose a conditional expression outside the aggregation function, you will have a condition that is evaluated once per dimensional value. Further, all three parameters of the If() function are aggregations, so you need to use aggregation functions, also in the condition, otherwise the expression will not be evaluated the way you want to.

So - don’t use naked field references!

   If( ShippingDate >= vReferenceDate, Sum( Amount ) )   // Incorrect !

   If( Min( ShippingDate ) >= vReferenceDate, Sum( Amount ) )   // Correct

If you instead put the conditional expression inside the aggregation function, you will have a very different situation: First, the condition will be evaluated on the record level of the source data. In other words: You may get performance problems if you have large data amounts.

   Sum( If( ShippingDate >= vReferenceDate, Amount ) )

Secondly, the aggregation function now contains an expression based on several fields (in the above example, ShippingDate and Amount), possibly from several source tables. This means that QlikView will aggregate over the Cartesian product of the included source tables. Normally this is not a problem, but in some odd cases, you will have results different from what you expect.

For instance, if the record with Amount has several shipping dates associated with it, the amount will be counted several times, once per shipping date, and you will get a result that you probably consider incorrect. There is usually a way to get around this problem by writing the expression differently, but if you can’t find one, you should use Set Analysis instead.

The conditional expression can be written in several ways:

  • String comparison:       If( Field = ‘string’, Amount )
  • Numeric comparison:   If( Field = number, Amount )
  • Boolean condition:       If( Flag, Amount )          e.g. Sum( If( IsThisYear, Amount ) )
  • Multiplication:              Flag * Amount              e.g. Sum( IsThisYear * Amount )

The two first examples contain comparisons, whereas the two last contain flags - Boolean fields created in the script. All four ways work fine, but I would recommend avoiding comparisons altogether. Use flags instead. See e.g. Year-over-Year Comparisons for more on flags.

Finally, you can choose to use Set Analysis. This is slightly different from other conditional expressions in that it uses the QlikView selection metaphor for the analysis: First, the Set Expression is interpreted as a selection, whereupon the aggregation is evaluated given this selection.

   Sum( {$<ShippingDate = {">='$(vReferenceDate)'"}>} Amount )

   Sum( {$<IsThisYear = {1}>} Amount )

This means that Set Analysis often is faster than using a conditional expression inside the aggregation. It also means that it calculates what you expect, as opposed to a case where an inside condition creates an unwanted Cartesian product.

However, a drawback with the Set Analysis is that it needs to be performed before QlikView performs the aggregation – you cannot have a Set Expression that evaluates to different values for different rows. The work-around is to calculate the condition in the script and store it in a flag.

Bottom line: Define flags in the script. And use Set Analysis.

HIC

 

Further reading related to this topic:

Performance of Conditional Aggregations

14 Comments
Creator
Creator

Great article!  I'm a big fan of defining flags in the script using "1" and "0" values, and multiplying them to the amount in the expressions.  They seem to perform calculations the quickest in most scenarios (but not always).

0 Likes
3,539 Views

I can only encourage you to continue using flags. They are often by far the best way to define conditions, and they simplify all expressions in the UI. And they can be used in Set Analysis also.

The blog post next week will be a comparison of the performance of the different methods to define conditions. So stay tuned...

HIC

0 Likes
3,539 Views
Partner
Partner

HIC,

superb, very nice blog

Best Regards,

Kalyan

0 Likes
3,539 Views
Specialist
Specialist

Thank you a lot for clarifying these topics in such clear and short way!

0 Likes
3,539 Views
Not applicable

Henric, thank you for this article, again it's very interesting and I'm just sad that I didn't dispose of such a great explanation when I started with qv.

I have a comment when conditional aggregation is used in for ex a list box. The differenence between:


if([Is ressource an agent?]='Yes',[Ressource name])

- the listbox will give you a list of every [Ressource name] that complies with the condition.

- selecting a ressource name will work as a normal list box (grey, white, green)

- selecting a ressource name also selects the condition, in this case the flag. So you will see two selections in the 'current selection box':

[Ressource name] = ...

[Is ressource an agent?] = 'Yes'

and:

aggr(only({<[Is ressource an agent?]={'Yes'}>}[Ressource name]),[Ressource name])

- the listbox will give you a list of every [Ressource name] that complies with the condition.

- selecting a ressource name will not work as a normal list box, the list itself will get smaller (only green)

- selecting a ressource name will not select the condition. So you will only see one selection in the 'current selection box':

[Ressource name] = ...

0 Likes
3,539 Views

Great!! Thanks.

0 Likes
3,539 Views