Skip to main content
hic
Former Employee
Former Employee

The Set Analysis is a commonly used tool when creating static or dynamic filters inside aggregations in Qlik Sense or QlikView. But sometimes you need the opposite to selecting – you need to exclude values. How can this be done?

In Set Analysis, it is straightforward to make selections; to define the criteria for inclusion. But it is not as straightforward to define an exclusion. But there are in fact several ways to do this.

 

First of all, an inclusion can be defined the following way:

 

Formula0.png

 

This expression is equivalent to saying “Sum the Amounts where Field equals X”.

 

But if you want to say the opposite – “where field does not equal X” – it becomes more complicated. The relation “not equal to” is not a Set operation. However, one way to do this is to use the implicit exclusion operator:

 

Formula1.png

 

Note the minus sign in front of the equals sign. This will create an element set based on the existing selected values, but with the value X removed.

 

A second way to do this is to use the unary exclusion operator:

 

Formula2.png

 

This will return the complement set of X and use this as element set in the set expression.

 

In many situations the two methods return identical sets. But there are cases when they are different. In the table below you can see that it makes a difference if there already is a selection in the field.

 

Image5.png

 

Note the difference on the row with the C. Since the implicit operator (the 2nd expression) is based on current selection, also the value C is excluded from the calculation. This is in contrast to the unary operator (the 3rd expression) that creates a completely new element set, not based on current selection.

 

We can also see that both of the above expressions exclude records where the field is NULL (the line where Field has a dash only). The reason is simple: As soon as there is a selection in the field, the logical inference will exclude NULL in the same field.

 

So what should you do if you want to exclude X but not NULL?

 

The answer is simple: Use another field for your selection. Typically you should use the primary key for the table where you find the Amount.

 

Formula3.png

 

Here you need to use the element function E(), that returns excluded values. Hence, the above Set expression says: “Select the IDs that get excluded when selecting X.”

 

Image7.png

 

The table confirms that we get the records we want: X is excluded but NULL is still included.

 

With this, I hope that you understand Set Analysis somewhat better.

 

HIC

 

Further reading related to this topic:

A Primer on Set Analysis

Why is it called Set Analysis?

Finding NULL

40 Comments
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

I have to add a comment to this one...

One of the most useful Set operators, I feel, is one that tends to be publicised the least:

=sum({<Region*={'South'}>}Amount)

This gives you the Amount, for the South, as long as it is included in the Current Selections.  If you select North in the Region field, the 'default' expression of:

=sum({<Region={'South'}>}Amount)

will still blindly give the value for the South in that column, totally disregarding (and confusing) the user.

Now, this forcing of a selection in a set is absolutely essential for prior period comparisons etc., but there are lots of cases where it does not provide intuitive output.

So, now to my point...  why, why, why does the syntax highlighter blow up when you use *= in a Set statement?  Is there something I am doing wrong in my expression, or is this a glitch that has been overlooked for many versions?

Thanks, as ever, for the insights hic‌ .

Steve

13,362 Views
Not applicable

I think you can also use this syntax Steve (this is how I achieve the same behaviour, and I think the syntax highlighter is ok with it):

=sum({$*<Region={'South'}>}Amount)

13,362 Views
anderseriksson
Partner - Specialist
Partner - Specialist

When it comes to set analysis the syntax highlighter is really bad.

Hopes are it will get a lot better in version 12.

13,161 Views
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Yes Graeme - but this doesn't give you the granularity of applying to one field and not another?

13,161 Views
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Indeed Anders, there are a number of ways to break it - as soon as you add a variable for one.

Looking at the syntax highlighting in Sense will give an indication - as it is the Sense engine that is going to be behind QlikView 12, and therefore potentially the syntax highlighting algorithms.

0 Likes
13,161 Views
mayankraoka
Specialist
Specialist

Good explanation.....

0 Likes
13,161 Views
Not applicable

stevedark Oh... that was missing from the requirements doc/examples! 


Agreed though - *= gives you more granularity within construction of the sets than $*<> (at the cost of the syntax highlighting).

13,161 Views
JonasValleskog
Partner - Creator
Partner - Creator

Steve:

Have you tried in 11.20 SR12? I'm not seeing any obvious issue with the syntax highlighter... Can you send a screen of the issue?

Graeme:

Just a quick tip on performance:

sum({$*<Region={'South'}>}Amount)

vs

sum({$<Region*={'South'}>}Amount)


Even if the results are the same in this simple syntax implementation, I would not use the $*< syntax as it is less descriptive of the intersect you are after - I.e. you've given the engine less information to ensure the intersect is identified in the most optimized way possible. In the first instance you're saying "intersect my current selections with another selection defined by all of my current selections where Region has been overridden to 'South'." in the second example you're saying "Relative to my current selection, intersect my Regions selection with 'South'". I don't know the exact way in which the selection engine carries out each of these operations as it's very much a 'black box' still but an educated guess would be that the first syntax will cost more to resolve for the selection engine.


HIC:

1. Any corrections on my educated guesses above on selection engine computational cost?

2. Idea: How about some transparency into computation algorithms in a future version of QV/QS by EXPLAIN PLAN like information breaking out estimated effort by the different computational steps involved in resolving an expression? Not a mainstream requirement, I know, but I think you'd get a major thumbs up from the seasoned developer community for such a feature.


Regards

Jonas

0 Likes
13,161 Views
Gysbert_Wassenaar

In addition to what Steve said it also refers to the default set. When you use alternate states in your document the default set for an object is not necessarily the same as the inherited set.

0 Likes
13,182 Views
Not applicable

jonasvalleskog

Hi Jonas,

To be honest, I'd never really had reason to give it that much thought - but I 100% agree with your reasoning and logic.  I knocked up a quick sample app, and the calc times (I know they aren't perfect) do appear to support your theory (although they are a little erratic).  I will try it in some real world examples when I get a chance and see if there is a noticeable improvement.  Thanks a lot for sharing your insight.  Would be really interested to hear from HIC regarding the internal mechanics of the set execution inside QV, and I really love the idea about the execution/explain plan for expressions!

Regards,

Graeme

0 Likes
13,182 Views