robert99
Specialist III

A Primer on Set Analysis Blog Post

I can't post a comment on Henric's latest blog post:

A Primer on Set Analysis

No comments have been posted there either. Is this a bug, or are comments not allowed now? This is the error I get:

Your content could not be saved due to an error. You may have been logged out. If this problem persists please contact your system administrator. Click here to refresh this page.

A Primer on Set Analysis


Set analysis is one of the more complex things you can define in QlikView or Qlik Sense.

Agree.

I still use If() at times because it's simple, whereas set analysis is confusing when $ signs and double quotes are involved (I'm not sure why it was designed this way).

For example

Sum( {$<Date={"<=$(=Max(Date))"}>} Sales)

compared to Sum(If(Date <= Max(Date), Sales))


One is straightforward, the other very confusing. Qlik should simplify this to

Sum( {$<Date <= {Max(Date)}>} Sales)


A question, though: is *= used to allow drill-down?

If() allows drill-down (for a filter selection on a field used in the expression), e.g.

Sum(If(Cust_Num = 1234 or Cust_Num = 1235, Sales))

Does

Sum({<Cust_Num *= {1234,1235}>} Sales)

only show sales for 1234 if the user selects 1234? It seems to in a number of examples I have tested.

Thanks

17 Replies
rubenmarin

Hi RJ, in "({<Cust_Num *= {1234,1235}>} Sales", *= means intersection: the result is the values that are both selected by the user and in this list (1234, 1235).
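To make the intersection concrete, here is a small Python sketch (not Qlik syntax, just a model of the behaviour described above). The function name `effective_customers`, the sample customer numbers other than 1234 and 1235, and the convention that `None` means "no selection on Cust_Num" are all assumptions made for this illustration.

```python
# Hypothetical model of how `Cust_Num *= {1234, 1235}` narrows a selection.
ALL_CUSTOMERS = {1233, 1234, 1235, 1236}  # made-up field values


def effective_customers(user_selection, set_values, all_values=ALL_CUSTOMERS):
    """Return the Cust_Num values the aggregation would see.

    user_selection: set of values picked by the user, or None for no selection.
    set_values: the values listed in the set modifier.
    """
    # Assumption: with no selection, every value of the field is possible,
    # so the intersection is simply the listed values that exist in the field.
    possible = all_values if user_selection is None else set(user_selection)
    return possible & set(set_values)


print(effective_customers(None, {1234, 1235}))    # both listed customers
print(effective_customers({1234}, {1234, 1235}))  # only 1234
print(effective_customers({1236}, {1234, 1235}))  # empty: nothing in common
```

With a plain `=` instead of `*=`, the user's selection on Cust_Num would be replaced by the list rather than intersected with it, which is why selections appear to stop working for that field.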

robert99
Specialist III
Author

Thanks

But it means that a user can drill down, i.e. selections on Cust_Num work, whereas without the * they have no effect on Cust_Num.

So the set {1234,1235} is intersected with the user's selection: with no selection on Cust_Num the total covers both 1234 and 1235, and with 1234 or 1235 selected the total is for that customer only.

This is the same as using If() (I think).

hic
Former Employee

You are right that an If() function inside the aggregation function is simpler and easier to understand. Such an expression is evaluated row by row, and this makes it conceptually simpler. But this is also the drawback: it is slow if you have large amounts of data.

So when Set Analysis was designed, the main goal was to make it fast. The solution was to make a selection "internal" to the aggregation function, so that, before the aggregation is calculated, you already have a binary vector pointing out the records to be included. But this also makes Set Analysis conceptually complex.
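The two evaluation strategies described above can be sketched roughly in Python. This is only a conceptual model under my own assumptions (made-up sample rows, hypothetical function names), not a description of how the QlikView engine is actually implemented.

```python
# Made-up sample data: one row per sale.
rows = [
    {"Cust_Num": 1234, "Sales": 100},
    {"Cust_Num": 1235, "Sales": 50},
    {"Cust_Num": 1236, "Sales": 70},
]


def sum_with_if(rows, wanted):
    # If()-style: the condition is tested again for every row
    # each time the aggregation is evaluated.
    return sum(r["Sales"] if r["Cust_Num"] in wanted else 0 for r in rows)


def build_mask(rows, wanted):
    # Set-analysis-style: decide once, up front, which records are in the set.
    return [r["Cust_Num"] in wanted for r in rows]


def sum_with_mask(rows, mask):
    # The aggregation then only adds up the records flagged in the mask.
    return sum(r["Sales"] for r, keep in zip(rows, mask) if keep)


mask = build_mask(rows, {1234, 1235})
print(sum_with_if(rows, {1234, 1235}))  # 150
print(sum_with_mask(rows, mask))        # 150
```

Both give the same total; the point is that the mask can be built once and reused, while the If() condition is re-evaluated wherever the aggregation is calculated.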


On the syntax: Date <= {Max(Date)} is ambiguous, since you could have a field value 'Max(Date)', with brackets and everything. Further, if we use Date <= <something> as an example, this means that QlikView internally first creates a For-Next loop over the distinct values of Date and makes a comparison for each date. But then you cannot use Date <= Max(Date), since the maximum inside the scope of the loop would be the same as the date itself. So you need to calculate Max(Date) in a larger scope than inside the For-Next loop, i.e. before the For-Next loop is evaluated. This is why you need a dollar expansion.
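The scoping argument above can be illustrated with a short Python sketch. The function names and the sample values are hypothetical, and dates are represented as plain integers to keep the example small.

```python
dates = [1, 2, 3, 4, 5]  # made-up distinct Date values as plain numbers


def select_inner_scope(dates):
    # Max evaluated inside the loop: only the current value is in scope,
    # so the "maximum" is the value itself and every value passes the test.
    selected = []
    for d in dates:
        max_in_scope = max([d])
        if d <= max_in_scope:
            selected.append(d)
    return selected


def select_outer_scope(dates, possible):
    # Max evaluated once, before the loop, over the currently possible values.
    # This is the role the dollar expansion plays in the set expression.
    limit = max(possible)
    return [d for d in dates if d <= limit]


print(select_inner_scope(dates))          # [1, 2, 3, 4, 5]: the filter does nothing
print(select_outer_scope(dates, [2, 3]))  # [1, 2, 3]: everything up to the max possible date
```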


Similar arguments can be made for most of the brackets inside the Set Analysis expression. As a result, the syntax is complex. But I frankly don't see that you can remove any of the brackets without getting logical problems or ambiguities.


HIC

robert99
Specialist III
Author

Thanks.

Luckily for me, speed is still OK with If() (it's not a huge database).

At times I have tried to write a complex set analysis expression and have given up, reverting to If() or to a Max(Date) calculated in the script.

rubenmarin

OK, I think I understand your question now: your point is that anything that can be done with set analysis can also be done with If()?

As Henric answers, the main (and possibly the only) reason to look for a solution using set analysis is performance. As a quick example:

If you use set analysis, the set is calculated once for the whole chart; it returns the filtered data, and the chart is calculated from that data.

If you use If(), QV has to make the comparison for each combination of dimension values.

hic
Former Employee

Set Analysis also allows you to make aggregations outside the set of possible records. For instance, if you select Month='May' you can make a Set Analysis expression that calculates the sales up until May, i.e. Jan-May. This is not possible with If() since Jan-Apr is excluded by the selection.
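A rough Python model of this difference, under my own assumptions (made-up monthly figures, a hypothetical numeric month column, hypothetical function names):

```python
# Made-up (Month, MonthNum, Sales) rows.
sales = [
    ("Jan", 1, 10), ("Feb", 2, 20), ("Mar", 3, 30),
    ("Apr", 4, 40), ("May", 5, 50), ("Jun", 6, 60),
]


def possible_rows(rows, selected_month):
    # Records left over after the user's selection on Month.
    return [r for r in rows if r[0] == selected_month]


def up_to_month_with_if(rows, selected_month):
    # If() can only test records that survive the selection,
    # so Jan-Apr are never seen when May is selected.
    possible = possible_rows(rows, selected_month)
    limit = max(r[1] for r in possible)
    return sum(r[2] for r in possible if r[1] <= limit)


def up_to_month_with_set(rows, selected_month):
    # A set expression can take the limit from the selection
    # but then aggregate over all records, including excluded ones.
    limit = max(r[1] for r in possible_rows(rows, selected_month))
    return sum(r[2] for r in rows if r[1] <= limit)


print(up_to_month_with_if(sales, "May"))   # 50: May only
print(up_to_month_with_set(sales, "May"))  # 150: Jan through May
```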

HIC

Anonymous
Not applicable

Henric,

The expression below works only if the field Date is formatted as a plain number, because Max() returns a number and the comparison here is "format-sensitive":

Sum( {$<Date={"<=$(=Max(Date))"}>} Sales)

I always use the Date() function for this:

Sum( {$<Date={"<=$(=date(Max(Date)))"}>} Sales)

Regards,
Michael

hic
Former Employee

You are absolutely right, Michael. I just didn't want to clutter the expression with too many functions...

Maybe the format-agnostic set analysis expression should be explained in its own blog post?

HIC

Anonymous
Not applicable

Consider that I've given you an idea for the next blog post. Looking forward to reading it soon...