Hello Community,
I have a big question, since it is causing some big discrepancies in my thought process.
I am trying to count with a condition...
I am under the impression I should be able to do this with either of the two lines of code listed below; however, I get two different results. So my question is: how do the two lines of code act differently? Or is something being done wrong on the back end that would have caused this?
Count({<event_type={33,34}>}od_id)
vs
Count(if(event_type=33, od_id) or if(event_type=34, od_id) )
Note I have tried different variations of these.
I have also tried to diagnose this with only one event_type and still get two different results.
Count({<event_type={33}>}od_id) //will produce 102.1K
vs
Count(if(event_type=33, od_id) ) // will produce 173.6K
Clarification on how these two formulas are working differently would be greatly appreciated.
Hi,
Count({SetAnalysis}) = Overrides the current selection before the Count expression is evaluated. The filter is applied to the whole data model, not row by row.
Count(IF()) = Evaluates the IF condition on each row, then applies the Count() function to the resulting values.
For example, let's say you have the following data model with 2 tables. Note that in Table 2 value (1,33) is duplicated.
1. Count({<event_type={33}>}od_id) = 2
2. Count(if(event_type=33, od_id) ) = 4
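To make the difference concrete, here is a minimal Python sketch that models the two expressions. It is not Qlik code and not how the engine is implemented. The table contents, the shared "key" field, and the function names are assumptions made up for illustration; they are chosen so that (1,33) is duplicated in the second table.

```python
# Hypothetical two-table data model, linked on a shared key field.
table1 = [(1, "A"), (1, "B"), (2, "C")]   # (key, od_id)
table2 = [(1, 33), (1, 33), (2, 34)]      # (key, event_type); (1, 33) is duplicated


def count_set_analysis(event_types):
    """Model of Count({<event_type={...}>} od_id).

    The selection is applied to the data model first; od_id is then
    counted only in the table where it lives.
    """
    selected_keys = {key for key, event_type in table2 if event_type in event_types}
    return sum(1 for key, _od_id in table1 if key in selected_keys)


def count_if(event_types):
    """Model of Count(If(event_type=..., od_id)).

    The two fields sit in different tables, so the rows are joined first
    and the condition is tested on every row of the joined table.
    """
    joined = [
        (od_id, event_type)
        for key1, od_id in table1
        for key2, event_type in table2
        if key1 == key2
    ]
    return sum(1 for _od_id, event_type in joined if event_type in event_types)


print(count_set_analysis({33}))  # 2
print(count_if({33}))            # 4
```

In this model the duplicated row in the second table doubles every matching od_id in the joined table, which is why the If version counts more.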
As a rule of thumb, use an If condition when the evaluation has to be done row by row. Otherwise, Set Analysis is the better choice.
Hope this helps!
BR,
Vu Nguyen
This was a great solution and works perfectly for what I am working on as well.
In addition to the answer above:
Generally, the differences between set analysis and an if function are:
1) Set analysis is a lot faster on large data volumes.
2) The Count(If(...)) is evaluated row by row, which makes it slow.
3) The Count(If(...)) must sometimes create a temporary table first, namely when "event_type" and "od_id" reside in different tables. In that case the temporary table may contain more records than the original table, and Count() will return a different result than it does with set analysis.
Further, the following expression is incorrect:
Count(if(event_type=33, od_id) or if(event_type=34, od_id) )
It should be written like this:
Count(if(event_type=33 or event_type=34, od_id) )
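The shape of the corrected expression can be sketched in Python: one combined condition decides whether a row contributes its od_id, and only the non-null results are counted. The rows and the function name below are hypothetical, and this is a model of the logic, not of Qlik itself.

```python
# Hypothetical (event_type, od_id) rows.
rows = [(33, "A"), (34, "B"), (35, "C"), (33, "D")]


def count_if_either(rows):
    """Model of Count(if(event_type=33 or event_type=34, od_id))."""
    # One condition per row: od_id when it matches, otherwise None (null).
    values = [
        od_id if (event_type == 33 or event_type == 34) else None
        for event_type, od_id in rows
    ]
    # Count() only counts the non-null values.
    return sum(1 for value in values if value is not None)


print(count_if_either(rows))  # 3
```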
See also
https://community.qlik.com/t5/Design/Conditional-Aggregations/ba-p/1473362
https://community.qlik.com/t5/Design/Performance-of-Conditional-Aggregations/ba-p/1463021
Finally, are you sure you want to use a non-distinct count? When counting IDs, you usually want Count(distinct ...).
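A small Python sketch of why the distinct count matters, assuming a hypothetical list of od_id values in which each ID appears twice (for example after a join has duplicated rows):

```python
# Hypothetical od_id values after rows were duplicated by a join.
od_ids = ["A", "A", "B", "B"]


def count_all(values):
    """Model of Count(od_id): every non-null row is counted."""
    return sum(1 for value in values if value is not None)


def count_distinct(values):
    """Model of Count(distinct od_id): each ID is counted once."""
    return len({value for value in values if value is not None})


print(count_all(od_ids))       # 4
print(count_distinct(od_ids))  # 2
```

The distinct count is unaffected by the duplicated rows, so it returns the same number whichever way the condition is expressed.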