I can't post a comment on Henric's last blog post,
and no comments have been posted there. Is this a bug, or are comments not allowed now?
Set analysis is one of the more complex things you can define in QlikView or Qlik Sense.
Agree.
I still use 'If' at times because it's simple, whereas Set Analysis is confusing when $ signs and double quotes are used (why it's been done this way I'm not too sure).
For example
Sum( {$<Date={"<=$(=Max(Date))"}>} Sales)
compared to Sum(If(Date <= Max(Date), Sales))
One is straightforward, the other very confusing. This should be simplified by Qlik to
Sum( {$<Date <= {Max(Date)}>} Sales)
A question though
Is *= used to allow drill down?
"If" allows drill down (for a filter selection contained in the expression)
e.g. Sum(If(Cust_Num = 1234 or Cust_Num = 1235, Sales))
Does Sum({<Cust_Num *= {1234,1235}>} Sales)
only show sales for 1234 if the user makes a selection of 1234? (It seems to in a number of examples I have tested it on.)
Thanks
Hi RJ, in "{<Cust_Num *= {1234,1235}>}" the *= means intersection: the resulting set is the values that are both selected by the user and in the listed values (1234, 1235).
Thanks
But it means that a user can drill down, i.e. selections work for Cust_Num, whereas without the * they don't work for Cust_Num.
So the set {1234,1235} is intersected with the selection made: with no selection in Cust_Num the total is given for both 1234 and 1235, and with a selection of either 1234 or 1235 the total is given for just that customer.
This is the same as using If (I think).
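The difference between = and *= discussed above can be modelled with ordinary set operations. The following is only a toy sketch in Python, not Qlik code: the rows, the function names and the idea of passing the selection in as a set are all assumptions made for illustration.

```python
# Toy model: sales rows as (Cust_Num, Sales). The values are made up.
ROWS = [(1234, 100), (1235, 50), (9999, 70)]
ELEMENTS = {1234, 1235}

# "No selection" is modelled as every customer being possible.
ALL_CUSTOMERS = {cust for cust, _ in ROWS}


def sum_with_assignment(selected, element_set):
    """Model of <Cust_Num = {...}>: the user's selection is ignored."""
    return sum(sales for cust, sales in ROWS if cust in element_set)


def sum_with_intersection(selected, element_set):
    """Model of <Cust_Num *= {...}>: keep customers that are both
    selected by the user and listed in the element set."""
    allowed = selected & element_set
    return sum(sales for cust, sales in ROWS if cust in allowed)


def sum_with_if(selected, element_set):
    """Model of Sum(If(...)): rows are limited by the selection first,
    then the condition is tested for each remaining row."""
    return sum(sales for cust, sales in ROWS
               if cust in selected and cust in element_set)
```

In this model, sum_with_intersection({1234}, ELEMENTS) and sum_with_if({1234}, ELEMENTS) both give 100, while sum_with_assignment({1234}, ELEMENTS) still gives 150 because the selection is overridden.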
You are right that an If() function inside the aggregation function is simpler and easier to understand. Such an expression is evaluated row by row, and this makes it conceptually simpler. But this is also the drawback: it is slow if you have large amounts of data.
So when Set Analysis was designed, the main goal was to make it fast. The solution was to make a selection "internal" to the aggregation function, so that before the aggregation is calculated you already have a binary vector pointing out the records to be included. But this also makes Set Analysis conceptually complex.
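As a rough illustration of the two approaches, here is a Python sketch. It is a conceptual model only and says nothing about how the engine is actually implemented; the data and function names are hypothetical.

```python
# Hypothetical records: one date number and one sales amount per row.
DATES = [1, 2, 3, 4, 5]
SALES = [10, 20, 30, 40, 50]
CUTOFF = 3


def sum_row_by_row(dates, sales, cutoff):
    """If()-style: the condition is tested inside the aggregation,
    once for every record, every time the sum is calculated."""
    total = 0
    for d, s in zip(dates, sales):
        if d <= cutoff:
            total += s
    return total


def build_mask(dates, cutoff):
    """Set-style: decide up front which records are included.
    The result plays the role of the binary vector."""
    return [d <= cutoff for d in dates]


def sum_with_mask(sales, mask):
    """Aggregate only the records the precomputed mask points out."""
    return sum(s for s, keep in zip(sales, mask) if keep)


mask = build_mask(DATES, CUTOFF)   # computed once, before any aggregation
```

Both functions return the same total here. The point of the mask is that it is built once and can then be reused for every aggregation that needs the same set of records.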
On the syntax: Date <= {Max(Date)} is ambiguous, since you could have a field value 'Max(Date)', with brackets and everything. Further, if we use Date <= <something> as an example, this means that QlikView internally first creates a For-Next loop over the distinct values of Date and makes a comparison for each date. But then you cannot use Date <= Max(Date), since the maximum inside the scope of the loop would be the same as the date. So you need to calculate Max(Date) in a larger scope than inside the For-Next loop. Hence, you need to calculate it before the For-Next loop is evaluated. This is why you need a dollar expansion.
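The scope argument can be sketched in a few lines of Python. This is an illustration of the reasoning only, with made-up date numbers and hypothetical function names, not a description of the real evaluation code.

```python
# Hypothetical data: all distinct dates, and the dates that are
# possible under the current selection.
DISTINCT_DATES = [1, 2, 3, 4, 5]
POSSIBLE_DATES = [2, 3]


def select_inside_loop(distinct_dates):
    """Max evaluated in the scope of a single loop value: the maximum
    of one date is that date, so every date passes the comparison."""
    return [d for d in distinct_dates if d <= max([d])]


def select_with_precomputed_max(distinct_dates, possible_dates):
    """Max evaluated once, before the loop starts. This is the role
    the dollar expansion plays in the search string."""
    limit = max(possible_dates)
    return [d for d in distinct_dates if d <= limit]
```

With this data the first function keeps all five dates, while the second keeps only 1, 2 and 3.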
Similar arguments can be made for most of the brackets inside the Set Analysis expression. As a result, the syntax is complex. But I frankly don't see that you can remove any of the brackets without getting logical problems or ambiguities.
HIC
Thanks.
Luckily for me speed is still OK with If (it's not a huge database).
At times I have tried to use a complex set analysis and have given up, reverting to If or a Max(Date) in the script.
OK, I think I understand your question now: your point is that anything done with set analysis can also be done with If?
As Henric answers, the main (and possibly the only) reason to look for a solution using set analysis is performance. As a quick example:
If you use set analysis: it is calculated once for the whole chart, it returns the filtered data, and the chart is calculated based on that data.
If you use an If: QV has to make the comparison for each combination of dimension values.
Set Analysis also allows you to make aggregations outside the set of possible records. For instance, if you select Month='May' you can make a Set Analysis expression that calculates the sales up until May, i.e. Jan-May. This is not possible with If() since Jan-Apr is excluded by the selection.
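The Jan-May case can be modelled in Python as follows. This is a toy sketch with invented figures and hypothetical function names: the If-style function only sees the rows left by the selection, while the set-style function defines its own set of records.

```python
MONTH_ORDER = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

# Hypothetical monthly sales.
ROWS = [("Jan", 10), ("Feb", 20), ("Mar", 30),
        ("Apr", 40), ("May", 50), ("Jun", 60)]


def possible_rows(selected_months):
    """Rows that remain possible after the user's selection."""
    return [row for row in ROWS if row[0] in selected_months]


def ytd_with_if(selected_months, up_to):
    """If()-style: the condition is applied only to possible rows."""
    limit = MONTH_ORDER.index(up_to)
    return sum(sales for month, sales in possible_rows(selected_months)
               if MONTH_ORDER.index(month) <= limit)


def ytd_with_set(up_to):
    """Set-style: the month selection is replaced by the set's own
    definition, so excluded months can be included again."""
    limit = MONTH_ORDER.index(up_to)
    return sum(sales for month, sales in ROWS
               if MONTH_ORDER.index(month) <= limit)
```

With Month = 'May' selected, ytd_with_if({"May"}, "May") gives 50, whereas ytd_with_set("May") gives 150 for Jan-May.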
HIC
Henric,
The expression below works only if the field Date is formatted as a plain number, because Max() returns a number, and in this case the comparison is "format-sensitive":
Sum( {$<Date={"<=$(=Max(Date))"}>} Sales)
I always use the Date() function for this:
Sum( {$<Date={"<=$(=date(Max(Date)))"}>} Sales)
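To see why the two expressions differ, it helps to look at the text the dollar expansion produces. The Python sketch below only builds the two search strings. It assumes the date serial number counts days from 1899-12-30, and it uses a strftime pattern as a stand-in for the field's date format; both function names are hypothetical.

```python
from datetime import date

# Assumption: day 0 of the date serial numbers is 1899-12-30.
EPOCH = date(1899, 12, 30)


def expand_plain_max(dates):
    """Text produced by "<=$(=Max(Date))": a bare serial number."""
    serial = (max(dates) - EPOCH).days
    return f"<={serial}"


def expand_formatted_max(dates, fmt="%Y-%m-%d"):
    """Text produced by "<=$(=Date(Max(Date)))": the maximum rendered
    in a date format (fmt is an assumed stand-in for that format)."""
    return f"<={max(dates).strftime(fmt)}"


sample = [date(2013, 1, 1), date(2013, 12, 31)]
plain = expand_plain_max(sample)          # "<=41639"
formatted = expand_formatted_max(sample)  # "<=2013-12-31"
```

The first string is compared against a field that displays as dates, the second matches the field's own text representation, which is the point Michael makes above.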
Regards,
Michael
You are absolutely right, Michael. I just didn't want to clutter the expression with too many functions...
Maybe the format-agnostic set analysis expression should be explained in its own blog post?
HIC
Consider that I've given you an idea for the next blog post. Looking forward to reading it soon...