17 Replies Latest reply: Feb 25, 2015 11:59 AM by Robert Hutchings

# A Primer on Set Analysis Blog Post

A Primer on Set Analysis

And as no comments are posted. Is this a bug or are comments not allowed now

A Primer on Set Analysis

Set analysis is one of the more complex things you can define in QlikView or Qlik Sense.

Agree.

I still use 'IF' when I can at times because its simple whereas Set analysis is confusing  when \$'s and double quotes are used .(why its been done this way I'm not too sure)

For example

Sum( {\$<Date={"<=\$(=Max(Date))"}>} Sales)

compared to Sum(if (Date <= Max (Date) , Sales))

One is straight forward. The other very confusing (this should be simplified by Qlik to

Sum( {\$<Date <=  {Max(Date)}>} Sales)

A question though

Is *= used to allow drill down

"If" allows drill down (for a filter selection contained in the expression)

eg  sum ( if Cust_Num = 1234 or Cust_num = 1235, Sales)

Does Sales ({<Cust_Num *= {1234,1235}>} Sales

only show sales for 1234 if the user makes a selection of 1234 (it seems to in a number of examples I have tested it on)

Thanks

• ###### Re: A Primer on Set Analysis Blog Post

Hi RJ, in "({<Cust_Num *= {1234,1235}>} Sales" *= means intersection, it is what values are selected by user and in this values (1234, 1235)

• ###### Re: A Primer on Set Analysis Blog Post

Thanks

But it means that a user can drill down ie selections work for Cust_Num whereas without the * it doesn't work for Cust_Num

So if the expression {1234,1235} intersects with the selection made either no selection for Cust_Num or either 1234 or 1235 then the total is given for either 1234 or 1235

This is the same as using if (I think).

• ###### Re: A Primer on Set Analysis Blog Post

Ok, I think I understand now your question, your point is that anything that is done with set analysis can be done with if?

As Henric answers, the main (an possibly the only) reason to search for a solution using set analysis is because of performance, as a quick example:

If you use 'set analysis': it is calculated one time for the whole chart, it returns the filtered data, and based on that data it calculates the graph.

If you use an 'If': QV has to make the 'if' comparison for each combination of dimension values

• ###### Re: A Primer on Set Analysis Blog Post

Set Analysis also allows you to make aggregations outside the set of possible records. For instance, if you select Month='May' you can make a Set Analysis expression that calculates the sales up until May, i.e. Jan-May. This is not possible with If() since Jan-Apr is excluded by the selection.

HIC

• ###### Re: A Primer on Set Analysis Blog Post

Yes, I know it has a lot of functionality (altern states, bookmarks, the "1-\$", E(), modifier operators...), what I tried to say is:

If you have something working with an 'if' solution, sometimes you look for a 'set analysis' solution because of performance.

I take this ocassion to thank you for all the very good posts on design blog.

• ###### Re: A Primer on Set Analysis Blog Post

No I have tried to convert all of my 'IF' expressions to set analysis

But with some of the more complex ones (using > etc. I didn't because I could not get them to work. Whereas 'IF'  worked as required

BUT one thing I did not like with set analysis (as opposed to if) is the lack of a drill down feature

Example

Sum( {\$<Date =  {Max(Date)}>} Sales)

If I wanted to filter (make  a selection) of Date = 10/01/2015. It wouldn't work because date is already set to Max(Date)

But if

Sum( {\$<Date *=  {Max(Date)}>} Sales)

is used it seems to overcome this issue

I have only used this in a few examples recently (once I found out about  *=) but am unsure if I should be using it in this way.

Hope this makes sense

• ###### Re: A Primer on Set Analysis Blog Post

You are right that an If() function inside the aggregation function is simpler and easier to understand. Such an expression is evaluated row by row, and this makes it conceptually simpler. But this is also the drawback: It is slow, if you have large data amounts.

So when the Set Analysis was designed, the main goal was to make it fast. The solution was to make a selection "internal" to the aggregation function, so that you before the aggregation is calculated already have a binary vector pointing out the records to be included. But this also makes the Set Analysis conceptually complex.

On the syntax: Date <= {Max(Date)} is ambiguous, since you could have a field value 'Max(Date)', with brackets and everything. Further, if we use Date <= <something> as example, this means that QlikView internally first creates a For-Next loop over the distinct values of Date and for each date makes a comparison. But then you cannot use Date <= Max(Date) since the maximum inside the scope of the loop would be the same as the date. So you need to calculate Max(Date) in a larger scope than inside the For-Next loop. Hence, you need to calculate it before the For-Next loop is evaluated. This is why you need a dollar expansion.

Similar arguments can be made for most of the brackets inside the Set Analysis expression. As a result, the syntax is complex. But I frankly don't see that you can remove any of the brackets without getting logical problems or ambiguities.

HIC

• ###### Re: A Primer on Set Analysis Blog Post

Thanks.

Luckily for me speed is still OK with  IF (its not  a huge database)

As I times I have tried to use a complex set analysis and have given up. Reverting to IF or a Max(date) in script.

• ###### Re: A Primer on Set Analysis Blog Post

As an example

This gives the correct result

Count

({< FSRNum2 = {1},CalTStdPM = {STD} ,Group = {Calls} >}
distinct (if ( ((Call_RDate)) >= ( VisMinStartRef)
,CallSerial)))

But this doesn't

Count     ({\$< FSRNum2 = {1},CalTStdPM = {STD} ,Group = {Calls},
Call_RDate = {" >= VisMinStartRef " }>}     distinct CallSerial)

I can not work out why?

• ###### Re: A Primer on Set Analysis Blog Post

Henric,

The expression below can work only if field Date formatted as a plain number, because max() returns a number, and in this case it is "format-sensitive":

Sum( {\$<Date={"<=\$(=Max(Date))"}>} Sales)

I always use date() function for this:

Sum( {\$<Date={"<=\$(=date(Max(Date)))"}>} Sales)

Regards,
Michael

• ###### Re: A Primer on Set Analysis Blog Post

You are absolutely right, Michael. I just didn't want to clutter the expression with too many functions...

Maybe the format-gnostic set analysis expression should be explained in its own blog post?

HIC

• ###### Re: A Primer on Set Analysis Blog Post

Consider I've given your an idea for the next blog post.  Looking forward to reading it soon...

• ###### Re: A Primer on Set Analysis Blog Post

Hi Henric,

great post, as usual.

Did i well understand ? Simple quotes should never be used with set analysis.

If i want to make a simple filter on a text field, i must write :

Sum( {\$<Country="France">} Sales) and not Sum( {\$<Country='France'>} Sales), correct ?

as you wrote

Single quotes should denote literals, i.e. explicit field values.

i'm a bit confused.

Best regards,

Johann

• ###### Re: A Primer on Set Analysis Blog Post

In your case, both will probably work. The single quotes should be a case-sensitive match and the double should be a case-insensitive search.

But as soon as you insert wild cards it will matter.

Consider the following: {\$<Country={"Fr*"}>} vs {\$<Country={'Fr*'}>}. Here the "Fr*" means a search and the * is a wild card. But since single quotes is a literal, 'Fr*' will match the exact combination of 'Fr*', i.e. three characters where the last is an asterisk.

HIC

• ###### Re: A Primer on Set Analysis Blog Post

so is there any performance boost to use single quotes when you actually want to use literals ? ie.

Sum{<MyField={'<field value here'}>} value)

Sum{<MyField={"<field value here"}>} value)

best regards.

Johann

• ###### Re: A Primer on Set Analysis Blog Post

Hi Johann,

I"m not Henric but can answer this one...

'France' is a filed value, hence it must be in the single quotes.  You second example is almost correct.  It should be:

Sum({\$<Country={'France'}>} Sales)

Regards,

Michael

• ###### Re: A Primer on Set Analysis Blog Post

Thanks michael !