Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
hic
Former Employee
Former Employee

Set Analysis is a commonly used tool when creating advanced formulas in Qlik Sense or QlikView. But why is it called “Set Analysis”? And why are there so many curly brackets?

Set Analysis is really a simple and straightforward tool, but with an advanced syntax. “Simple” since its only function is to allow you to define a calculation scope different from the current selection. The basics are very much like defining a SQL Where clause.

 

A different term for the scope is the “record set”. With this, we mean the set of records in the data over which the aggregations should be calculated.

 

Image4.png

 

The above Set analysis expression is equivalent to selecting 2015 from OrderYear. It is also equivalent to a SQL condition “Where OrderYear=2015”.

 

The Set analysis syntax is advanced because there are so many different ways to define a record set; you can use a selection, a bookmark, or an alternate state. Further, just as in any set algebra, you can define intersections, unions and other set operations, e.g. the difference between two sets:

 

Image1.png

 

This expression describes the Set difference between current selection in the default state $ and selection in the alternate state State2: I.e. records that are included by the default state, but not included in State2.

 

Venn.png

 

Another set expression is the following:

 

Image3.png

 

This defines the records belonging to any order date that occurs before the latest order date in the current selection. The dollar expansion is needed, since the parser expects field values in clear text in the search string.

 

Note that the last Set expression contains two pairs of curly brackets. Set analysis expressions with many curly brackets look complicated, but there is in fact method in the madness…

 

  • The outer brackets: Sum( { … } Sales )
    These define the record set; i.e. the records in the data over which the aggregation should be made.
  • The inner brackets: Sum( {$<Field= { … } >} Sales )
    These define the element set; i.e. the set of individual field values that define the selection in this field.

 

For those of you who have read about Symbol Tables and Bit-Stuffed Pointers it may be interesting to know that the element set operate on the symbol tables, whereas the record set is defined on the data tables.

 

The element set can be a list of values or a search string. It can also be described by field references or by the P() and E() functions, e.g.

 

Image5.png

 

Summary:

  • There are two different sets in a Set expression: The Record Set and the Element Set.
  • The record set (the outer curly brackets) defines the records in the data tables. Use 1, $, a bookmark or an alternate state as Set identifier.
  • The element set (the inner curly brackets) defines the selection that modifies a record set. Use one or several of the following to define your element set:
    • A list of values, e.g: {2013,2014,2015}
    • A search, e.g: {">=2015-01-06<=2015-02-05"}
    • A field reference
    • A set function, i.e: P() or E()
  • Set operators can be used to define both the record set and the element set.

 

With this, I hope that you understand Set Analysis somewhat better.

 

HIC

 

Further reading related to this topic:

A Primer on Set Analysis

Symbol Tables and Bit-Stuffed Pointers

Implicit Set Operators

25 Comments
nizamsha
Specialist II
Specialist II

Great Post ,

can we get how and all we can write set  Analysis each and every corner , we dont know or else we are writing but dont know how its working because some time i faced this issue

0 Likes
2,733 Views
mvanlutterveld
Partner - Creator II
Partner - Creator II

Nice post! Set Analysis is a powerful tool indeed. But I must say I use it with caution. First of all I think Set Analysis is a bit difficult to read and to explain. That’s why I always keep my Set Analysis statements as simple as possible. To achieve this I use flag fields indicating a certain state of a field (like [Previous Year], [Current Year], etc.). Secondly, many Set Analysis ‘solutions’ can / should be solved in the script / data model. IMHO Set Analysis opens a door in solving problems with the data model that should be solved on the script side.

Cheers, Michiel

0 Likes
2,733 Views
hic
Former Employee
Former Employee

Michiel

I absolutely agree. One question you always should ask yourself is: "Is this condition dynamic? Or can it be calculated already in the script?"

If it can be calculated in the script, it should be calculated in the script.

But there are many conditions that are dynamic, i.e. you want the user selection to influence the result. And then Set Analysis is the solution.

HIC

2,752 Views
Anonymous
Not applicable

Hi Henric,

Great!!!

Thanks for Explaining in detail,very useful.

Regards

Neetha

0 Likes
2,752 Views
Not applicable

Great post!!

Thanks

0 Likes
2,752 Views
Not applicable

Clear post very helpful.

Thanks.

0 Likes
2,752 Views
mayankraoka
Specialist
Specialist

too good explanation

0 Likes
2,752 Views
Not applicable

Mr.Henric cronstrom

                              will you help me to know clear explanation about Incremental load and section access.

                                 please............

0 Likes
2,752 Views
Not applicable

Hello hic,

Should you see this comment, or any other follower with suitable knowledge:

I need to adjust the MAX expressions you wrote in your original post to an aggregate expression which I have for my DB.

My query is in the line chart is:

Aggr (Sum (Q1),[Day\Month\Year],QT1)

Where Q1 is daily incoming value which are entered the DB and viewed in different level (by client,by city, by state and by type of client = QT1).  These accumulated data is shown in a line chart (again, it can be presented by client type and\or aggr by city\country or only by client).

Now, What I am looking for is, when I select a flexible period (week, two month, year and so on) to present in a text field the value (aggr or not) of the selected period.

For example if I select the period 7.4.15 to 18.6.15 the chart show me, aggr data of the income according to my list selection ( I can select one client or a client type from 2 cities or a client type from a certain country) and I want to show in the text field the aggr sum qty for the 18.6.15

I I put the MAX ahead of my expression (Max (aggr (sum.....) it select the max value and not the latest vale as I wish.

If I write the following (without the aggr):

SUM ({$<[Day\Month\Year]={"<$(=[Day\Month\Year] max [Day\Month\Year] )))"}>}Q1   - it only sum the selected values, it does not work.


Your help will be most appreciated!


Nir

0 Likes
2,771 Views
anderseriksson
Partner - Specialist
Partner - Specialist

You want the sum for the last date in the selected period?

Your last expression is on the right way but your set analysis for [Day\Month\Year] look real strange?

What on earth is this supposed to do: $(=[Day\Month\Year] max [Day\Month\Year] )

That will not execute and your expression will ignore the set analysis on [Day\Month\Year].

You need something along this;

SUM ({$<[Day\Month\Year]={"$(=Max([Day\Month\Year]))"}>} Q1)

Depending on how your fields are defined you might need to format the output of Max() since the set analysis is a text comparison;

SUM ({$<[Day\Month\Year]={"$(=Date(Max([Day\Month\Year])), 'dd\MM\yyyy')"}>} Q1)

or whatever your date format is in that field.

0 Likes
2,771 Views