Set Analysis is a commonly used tool when creating advanced formulas in Qlik Sense or QlikView. But why is it called “Set Analysis”? And why are there so many curly brackets?


Set Analysis is really a simple and straightforward tool, but with an advanced syntax. “Simple” since its only function is to allow you to define a calculation scope different from the current selection. The basics are very much like defining a SQL Where clause.

 

A different term for the scope is the “record set”. With this, we mean the set of records in the data over which the aggregations should be calculated.

 

Image4.png

 

The above Set analysis expression is equivalent to selecting 2015 from OrderYear. It is also equivalent to a SQL condition “Where OrderYear=2015”.

 

The Set analysis syntax is advanced because there are so many different ways to define a record set; you can use a selection, a bookmark, or an alternate state. Further, just as in any set algebra, you can define intersections, unions and other set operations, e.g. the difference between two sets:

 

Image1.png

 

This expression describes the Set difference between current selection $ and the alternate state State2: I.e. records that are included by the current selection, but not included in State2.

 

Venn.png

 

Another set expression is the following:

 

Image3.png

 

This defines the records belonging to any order date that occurs before the latest order date in the current selection. The dollar expansion is needed, since the parser expects field values in clear text in the search string.

 

Note that the last Set expression contains two pairs of curly brackets. Set analysis expressions with many curly brackets look complicated, but there is in fact method in the madness…

 

  • The outer brackets: Sum( { … } Sales )
    These define the record set; i.e. the records in the data over which the aggregation should be made.
  • The inner brackets: Sum( {$<Field= { … } >} Sales )
    These define the element set; i.e. the set of individual field values that define the selection in this field.

 

For those of you who have read about Symbol Tables and Bit-Stuffed Pointers it may be interesting to know that the element set operate on the symbol tables, whereas the record set is defined on the data tables.

 

The element set can be a list of values or a search string. It can also be described by field references or by the P() and E() functions, e.g.

 

Image5.png

 

Summary:

  • There are two different sets in a Set expression: The Record Set and the Element Set.
  • The record set (the outer curly brackets) defines the records in the data tables. Use 1, $, a bookmark or an alternate state as Set identifier.
  • The element set (the inner curly brackets) defines the selection that modifies a record set. Use one or several of the following to define your element set:
    • A list of values, e.g: {2013,2014,2015}
    • A search, e.g: {">=2015-01-06<=2015-02-05"}
    • A field reference
    • A set function, i.e: P() or E()
  • Set operators can be used to define both the record set and the element set.

 

With this, I hope that you understand Set Analysis somewhat better.

 

HIC

 

Further reading related to this topic:

A Primer on Set Analysis

Symbol Tables and Bit-Stuffed Pointers

Implicit Set Operators