Qlik Community

Qlik Design Blog

All about product and Qlik solutions: scripting, data modeling, visual design, extensions, best practices, etc.

Why is it called Set Analysis?

Set Analysis is a commonly used tool when creating advanced formulas in Qlik Sense or QlikView. But why is it called “Set Analysis”? And why are there so many curly brackets?

Set Analysis is really a simple and straightforward tool, but with an advanced syntax. “Simple” since its only function is to allow you to define a calculation scope different from the current selection. The basics are very much like defining a SQL Where clause.

 

A different term for the scope is the “record set”. With this, we mean the set of records in the data over which the aggregations should be calculated.

 

Image4.png

 

The above Set analysis expression is equivalent to selecting 2015 from OrderYear. It is also equivalent to a SQL condition “Where OrderYear=2015”.

 

The Set analysis syntax is advanced because there are so many different ways to define a record set; you can use a selection, a bookmark, or an alternate state. Further, just as in any set algebra, you can define intersections, unions and other set operations, e.g. the difference between two sets:

 

Image1.png

 

This expression describes the Set difference between current selection in the default state $ and selection in the alternate state State2: I.e. records that are included by the default state, but not included in State2.

 

Venn.png

 

Another set expression is the following:

 

Image3.png

 

This defines the records belonging to any order date that occurs before the latest order date in the current selection. The dollar expansion is needed, since the parser expects field values in clear text in the search string.

 

Note that the last Set expression contains two pairs of curly brackets. Set analysis expressions with many curly brackets look complicated, but there is in fact method in the madness…

 

  • The outer brackets: Sum( { … } Sales )
    These define the record set; i.e. the records in the data over which the aggregation should be made.
  • The inner brackets: Sum( {$<Field= { … } >} Sales )
    These define the element set; i.e. the set of individual field values that define the selection in this field.

 

For those of you who have read about Symbol Tables and Bit-Stuffed Pointers it may be interesting to know that the element set operate on the symbol tables, whereas the record set is defined on the data tables.

 

The element set can be a list of values or a search string. It can also be described by field references or by the P() and E() functions, e.g.

 

Image5.png

 

Summary:

  • There are two different sets in a Set expression: The Record Set and the Element Set.
  • The record set (the outer curly brackets) defines the records in the data tables. Use 1, $, a bookmark or an alternate state as Set identifier.
  • The element set (the inner curly brackets) defines the selection that modifies a record set. Use one or several of the following to define your element set:
    • A list of values, e.g: {2013,2014,2015}
    • A search, e.g: {">=2015-01-06<=2015-02-05"}
    • A field reference
    • A set function, i.e: P() or E()
  • Set operators can be used to define both the record set and the element set.

 

With this, I hope that you understand Set Analysis somewhat better.

 

HIC

 

Further reading related to this topic:

A Primer on Set Analysis

Symbol Tables and Bit-Stuffed Pointers

Implicit Set Operators

25 Comments
datanibbler
Esteemed Contributor

Hi,

great post! Set_analysis is really one of the most powerful tools there are in QlikView and thus well worth exploring.

The only drawback - but that would be for Qlik to fix - is that the expression_editor often does not recognize SET_analysis_syntax and underlines it in red. That is a bit annoying, especially if there are another two or three expressions following because the editor doesn't even check those.

Is there any chance that qlik is going to fix that? (maybe it already has, we don't update that often)

There are alternatives - with less elegant expressions that the editor does recognize.

1,466 Views
SreeniJD
Valued Contributor

Hi HIC,

Nice post. Keep update this post as it deserves to have more examples and enhanced versions.

Sreeni

0 Likes
1,466 Views
Not applicable

Nice Explanation!! Very helpful

0 Likes
1,466 Views
qlikviewnovice
Valued Contributor II

Simple and clear explanation

0 Likes
1,466 Views
Arjunarao
Honored Contributor II

Hi hic,

Thanks for the detailed explanation on set analysis.

Could you tell me what this {$<Customer=Customer+P({1<OrderYear={2015}>}Customer)>} set analysis will show the data?

0 Likes
1,466 Views
Partner
Partner

Update to the latest version and you will see improvements of the editors understanding of set analysis..

0 Likes
1,466 Views
Partner
Partner

The currently selected Customers plus those Customers with orders in 2015.

1,466 Views
MVP & Luminary
MVP & Luminary

‌I disagree that the latest update (sr12) improves the editor syntax checking. In fact, I think sr11 makes it more difficult.

SR11 Error Handling Changes | Qlikview Cookbook

It looks like QLikView Version 12, when it becomes available, will have much improved syntax checking.

1,466 Views
Partner
Partner

Yes read your blog, but I think there are some improvements also to the syntax check, at least in SR12.

Probably there are both and you should try which you like the best.

Or hope for version 12 to sort it out.

0 Likes
1,466 Views
Not applicable

Very helpful TNX

0 Likes
1,466 Views
nizamsha
Valued Contributor II

Great Post ,

can we get how and all we can write set  Analysis each and every corner , we dont know or else we are writing but dont know how its working because some time i faced this issue

0 Likes
1,466 Views
Partner
Partner

Nice post! Set Analysis is a powerful tool indeed. But I must say I use it with caution. First of all I think Set Analysis is a bit difficult to read and to explain. That’s why I always keep my Set Analysis statements as simple as possible. To achieve this I use flag fields indicating a certain state of a field (like [Previous Year], [Current Year], etc.). Secondly, many Set Analysis ‘solutions’ can / should be solved in the script / data model. IMHO Set Analysis opens a door in solving problems with the data model that should be solved on the script side.

Cheers, Michiel

0 Likes
1,466 Views

Michiel

I absolutely agree. One question you always should ask yourself is: "Is this condition dynamic? Or can it be calculated already in the script?"

If it can be calculated in the script, it should be calculated in the script.

But there are many conditions that are dynamic, i.e. you want the user selection to influence the result. And then Set Analysis is the solution.

HIC

1,466 Views
neetha_p
Honored Contributor

Hi Henric,

Great!!!

Thanks for Explaining in detail,very useful.

Regards

Neetha

0 Likes
1,466 Views
Not applicable

Great post!!

Thanks

0 Likes
1,466 Views
Not applicable

Clear post very helpful.

Thanks.

0 Likes
1,466 Views
mayankraoka
Valued Contributor

too good explanation

0 Likes
1,466 Views
Not applicable

Mr.Henric cronstrom

                              will you help me to know clear explanation about Incremental load and section access.

                                 please............

0 Likes
1,466 Views
Not applicable

Hello hic,

Should you see this comment, or any other follower with suitable knowledge:

I need to adjust the MAX expressions you wrote in your original post to an aggregate expression which I have for my DB.

My query is in the line chart is:

Aggr (Sum (Q1),[Day\Month\Year],QT1)

Where Q1 is daily incoming value which are entered the DB and viewed in different level (by client,by city, by state and by type of client = QT1).  These accumulated data is shown in a line chart (again, it can be presented by client type and\or aggr by city\country or only by client).

Now, What I am looking for is, when I select a flexible period (week, two month, year and so on) to present in a text field the value (aggr or not) of the selected period.

For example if I select the period 7.4.15 to 18.6.15 the chart show me, aggr data of the income according to my list selection ( I can select one client or a client type from 2 cities or a client type from a certain country) and I want to show in the text field the aggr sum qty for the 18.6.15

I I put the MAX ahead of my expression (Max (aggr (sum.....) it select the max value and not the latest vale as I wish.

If I write the following (without the aggr):

SUM ({$<[Day\Month\Year]={"<$(=[Day\Month\Year] max [Day\Month\Year] )))"}>}Q1   - it only sum the selected values, it does not work.


Your help will be most appreciated!


Nir

0 Likes
1,466 Views
Partner
Partner

You want the sum for the last date in the selected period?

Your last expression is on the right way but your set analysis for [Day\Month\Year] look real strange?

What on earth is this supposed to do: $(=[Day\Month\Year] max [Day\Month\Year] )

That will not execute and your expression will ignore the set analysis on [Day\Month\Year].

You need something along this;

SUM ({$<[Day\Month\Year]={"$(=Max([Day\Month\Year]))"}>} Q1)

Depending on how your fields are defined you might need to format the output of Max() since the set analysis is a text comparison;

SUM ({$<[Day\Month\Year]={"$(=Date(Max([Day\Month\Year])), 'dd\MM\yyyy')"}>} Q1)

or whatever your date format is in that field.

0 Likes
1,466 Views
Not applicable

Hi Henric Sir,

The dollar expansion is needed, since the parser expects field values in clear text in the search string.

Kindly explain the meaning of above sentence.

Best Regards

Tirupati Rao

0 Likes
1,466 Views

It's quite simple: When interpreting the expression, the software expects a field value where the dollar-expansion is written. For example, below the date (in bold) is a field value:

  Sum({$<Date={"2016-01-01"}>} Sales)

This expression will work fine.

This means that the following will not work:

  Sum({$<Date={"Date(Max(Date))"}>} Sales)

The engine will in this case look for a field value containing this exact string: Date(Max(Date)). It will not see this as an expression that should be evaluated.

But if you instead wrap the calculation in a dollar expansion, it will work.

  Sum({$<Date={"$(=Date(Max(Date)))"}>} Sales)

The dollar expansion will then be calculated before the rest of the expression.

HIC

1,466 Views
Arjunarao
Honored Contributor II

Search in Community, you can find lot of threads.

0 Likes
1,466 Views
narayanareddyma
New Contributor

Thanks Henric! Very useful

0 Likes
1,466 Views
marjph
New Contributor II

Thank you for this! 

0 Likes
1,338 Views