Hi,
What is Set Analysis? And what is the advantage of using it?
Regards,XXX
Sets can be used in aggregation functions. Aggregation functions normally aggregate over the set of possible records defined by the current selection. But an alternative set of records can be defined by a set expression. Hence, a set is conceptually similar to a selection.
A set expression always begins and ends with curly brackets when used, e.g. {BM01}.
There is one constant that can be used to denote a record set; 1. It represents the full set of all the records in the application.
The $ sign represents the records of the current selection. The set expression {$} is thus the equivalent of not stating a set expression. {1-$} is all the more interesting as it defines the inverse of the current selection, i.e. everything that the current selection excludes.
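To make the identifiers 1, $ and 1-$ concrete, here is a minimal sketch in Python (not QlikView syntax) that models each record set as a set of row indices. The data, the chosen selection (Region = US) and all names are invented for illustration.

```python
# Hypothetical records: (region, sales). Not taken from any real application.
records = [("US", 100), ("SE", 40), ("DE", 60), ("US", 25)]

# Model the set identifier 1: every record in the application.
full_set = set(range(len(records)))

# Model $: records allowed by the current selection (assumed here: Region = US).
current = {i for i, (region, _) in enumerate(records) if region == "US"}

# Model 1-$: everything the current selection excludes.
inverse = full_set - current

def total_sales(record_ids):
    """Sum the sales column over a set of record indices."""
    return sum(records[i][1] for i in record_ids)

sum_current = total_sales(current)   # plays the role of sum({$} Sales)
sum_all = total_sales(full_set)      # plays the role of sum({1} Sales)
sum_inverse = total_sales(inverse)   # plays the role of sum({1-$} Sales)
```

In this model the current and inverse sums always add up to the sum over the full set, which is what "inverse of the current selection" implies.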
Selections from the Back/Forward stack can be used as set identifiers, by use of the dollar symbol: $1 represents the previous selection, i.e. equivalent to pressing the Back button. Similarly, $_1 represents one step forward, i.e. equivalent to pressing the Forward button. Any unsigned integer can be used in the Back and Forward notations, e.g. $0 represents the current selection.
Finally, bookmarks can be used as set identifiers. Note that only server and document bookmarks can be used as set identifiers. Either the bookmark ID or the bookmark name can be used, e.g. BM01 or MyBookmark. Only the selection part of a bookmark is used. The values are not included. It is thus not possible to use input fields in bookmarks for set analysis.
sum( {$} Sales )
returns sales for the current selection, i.e. the same as sum(Sales).
sum( {$1} Sales )
returns sales for the previous selection.
sum( {$_2} Sales )
returns sales for the 2nd next selection, i.e. two steps forward. Only relevant if you just made two Back operations.
sum( {1} Sales )
returns total sales within the application, disregarding the selection but not the dimension. If used in a chart with e.g. Products as dimension, each product will get a different value.
sum( {1} Total Sales )
returns total sales within the application, disregarding both selection and dimension. I.e. the same as sum(All Sales).
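The difference between sum( {1} Sales ) and sum( {1} Total Sales ) in a chart can be sketched with a small Python model (not QlikView syntax). The rows and the helper name chart are hypothetical; the flag use_total stands in for the Total qualifier.

```python
from collections import defaultdict

# Hypothetical rows: (product, region, sales).
rows = [
    ("Shoe", "US", 100),
    ("Shoe", "SE", 40),
    ("Hat", "US", 25),
    ("Hat", "DE", 60),
]

def chart(rows_in_set, use_total):
    """Return one value per Product, the chart dimension.

    rows_in_set models the record set chosen by the set expression.
    use_total models the Total qualifier: the dimension is ignored too.
    """
    grand_total = sum(sales for _, _, sales in rows_in_set)
    per_product = defaultdict(int)
    for product, _, sales in rows_in_set:
        per_product[product] += sales
    products = sorted({product for product, _, _ in rows})
    if use_total:
        return {p: grand_total for p in products}
    return {p: per_product[p] for p in products}

# Whatever the user has selected, {1} means all rows are passed in.
by_product = chart(rows, use_total=False)   # plays the role of sum({1} Sales)
overall = chart(rows, use_total=True)       # plays the role of sum({1} Total Sales)
```

Without the qualifier each product gets its own value; with it, every row of the chart shows the same grand total.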
sum( {BM01} Sales )
returns sales for the bookmark BM01.
sum( {MyBookMark} Sales )
returns sales for the bookmark MyBookMark.
sum({Server\BM01} Sales)
returns the sales for the server bookmark BM01.
sum({Document\MyBookmark}Sales)
returns the sales for the document bookmark MyBookmark.
Several set operators that can be used in set expressions exist. All set operators use sets as operands, as described above, and return a set as result.
+ Union. This binary operation returns a set consisting of the records that belong to any of the two set operands.
- Exclusion. This binary operation returns a set of the records that belong to the first but not the other of the two set operands. Also, when used as a unary operator, it returns the complement set.
* Intersection. This binary operation returns a set consisting of the records that belong to both of the two set operands.
/ Symmetric difference (XOR). This binary operation returns a set consisting of the records that belong to either, but not both of the two set operands.
The order of precedence is 1) Unary minus (complement), 2) Intersection and Symmetric difference, and 3) Union and Exclusion. Within a group, the expression is evaluated from left to right. Alternative orders can be defined by standard brackets, which may be necessary since the set operators do not commute, e.g. A+(B-C) is different from (A+B)-C which in turn is different from (A-C)+B.
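The claim that the three groupings differ can be checked with ordinary sets. This is a sketch in Python, not QlikView syntax; the sets A, B and C are made-up record ids.

```python
# Hypothetical record sets, given as sets of record ids.
A = {1, 2, 3}
B = {3, 4, 5}
C = {1, 4}

# Python's set operators correspond to the ones described above:
#   |  union (+)          -  exclusion (-)
#   &  intersection (*)   ^  symmetric difference (/)
# Python's own precedence rules differ from QlikView's, so every
# grouping below is written with explicit parentheses.
first = A | (B - C)      # A+(B-C)
second = (A | B) - C     # (A+B)-C
third = (A - C) | B      # (A-C)+B
```

With these inputs the three results are {1, 2, 3, 5}, {2, 3, 5} and {2, 3, 4, 5}, so the placement of the brackets changes the outcome.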
sum( {1-$} Sales )
returns sales for everything excluded by the current selection.
sum( {$*BM01} Sales )
returns sales for the intersection between the current selection and bookmark BM01.
sum( {-($+BM01)} Sales )
returns sales excluded by current selection and bookmark BM01.
Note
The use of set operators in combination with basic aggregation expressions involving fields from multiple QlikView tables may cause unpredictable results and should be avoided. E.g. if Quantity and Price are fields from different tables, then the expression sum({$*BM01}Quantity*Price) should be avoided.
A set can be modified by an additional or a changed selection. Such a modification can be written in the set expression. The modifier consists of one or several field names, each followed by a selection that should be made on the field, all enclosed by < and >. E.g. <Year={2007,+2008},Region={US}>. Field names and field values can be quoted as usual, e.g. <[Sales Region]={'West coast', 'South America'}>.
There are several ways to define the selection: A simple case is a selection based on the selected values of another field, e.g. <OrderDate = DeliveryDate>. This modifier will take the selected values from DeliveryDate and apply those as a selection on OrderDate. If there are many distinct values - more than a couple of hundred - then this operation is CPU intense and should be avoided.
The most common case, however, is a selection based on a field value list enclosed in curly brackets, the values separated by commas, e.g. <Year = {2007, 2008}>. The curly brackets here define an element set, where the elements can be either field values or searches of field values. A search is always defined by the use of double quotes, e.g. <Ingredient = {"*Garlic*"}> will select all ingredients including the string 'garlic'. Searches are case-insensitive and are made also over excluded values.
Empty element sets, either explicitly e.g. <Product = {}> or implicitly e.g. <Product = {"Perpetuum Mobile"}> (a search with no hits) mean no product, i.e. they will result in a set of records that are not associated with any product. Note that this set cannot be achieved through usual selections, unless a selection is made in another field, e.g. TransactionID.
Finally, for fields in and-mode, there is also the possibility of forced exclusion. If you want to force exclusion of specific field values, you will need to use "~" in front of the field name.
A set modifier can be used on a set identifier or on its own. It cannot be used on a set expression. When used on a set identifier, the modifier must be written immediately after the set identifier, e.g. {$<Year = {2007, 2008}>}. When used on its own, it is interpreted as a modification of the current selection.
sum( {1<Region= {US} >} Sales )
returns the sales for region US disregarding the current selection.
sum( {$<Region = >} Sales )
returns the sales for the current selection, but with the selection in "Region" removed.
sum( {<Region = >} Sales )
returns the same as the example immediately above. When the set to modify is omitted, $ is assumed.
Note!
The syntax in the two previous examples is interpreted as "no selections" in "Region", i.e. all regions given other selections will be possible. It is not equivalent to the syntax <Region = {}> (or any other text on the right side of the equal sign implicitly resulting in an empty element set) which is interpreted as no region.
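The distinction between <Region = > and <Region = {}> can be modeled in a few lines of Python (not QlikView syntax). The rows and the function sum_sales are hypothetical; None is used here to represent both "no selection in the field" and "a record with no region", which is an assumption of this sketch.

```python
# Hypothetical rows: (region, sales). A region of None means the record
# is not associated with any region.
rows = [("US", 100), ("SE", 40), (None, 15)]

def sum_sales(region_selection):
    """Sum sales under a selection in Region.

    region_selection is None when Region has no selection at all,
    otherwise it is the set of selected values (possibly empty).
    """
    if region_selection is None:
        # Like <Region = >: the field does not restrict anything.
        return sum(sales for _, sales in rows)
    if not region_selection:
        # Like <Region = {}>: only records tied to no region remain.
        return sum(sales for region, sales in rows if region is None)
    return sum(sales for region, sales in rows if region in region_selection)

cleared = sum_sales(None)      # plays the role of sum({$<Region = >} Sales)
empty = sum_sales(set())       # plays the role of sum({$<Region = {}>} Sales)
only_us = sum_sales({"US"})    # plays the role of sum({$<Region = {US}>} Sales)
```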
sum( {$<Year = {2000}, Region = {US, SE, DE, UK, FR}>} Sales )
returns the sales for current selection, but with new selections both in "Year" and in "Region".
sum( {$<~Ingredient = {"*garlic*"}>} Sales )
returns the sales for current selection, but with a forced exclusion of all Ingredients containing the string 'garlic'.
sum( {$<Year = {"2*"}>} Sales )
returns the sales for current selection, but with all years beginning with the digit "2", i.e. most likely year 2000 and onwards, selected in the field "Year".
sum( {$<Year = {"2*","198*"}>} Sales )
as above, but now the 1980s are also included in the selection.
sum( {$<Year = {">1978<2004"}>} Sales )
as above, but now with a numeric search so that an arbitrary range can be specified.
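How the three search masks above pick values can be sketched in Python (not QlikView syntax). The list of years and both helper functions are hypothetical, and the numeric helper deliberately handles only masks of the form ">low<high"; real searches accept more forms than this.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical values in the Year field.
years = [1975, 1980, 1989, 1997, 2000, 2004, 2010]

def wildcard_search(values, *masks):
    """Keep values matching any mask; * matches any run of characters.

    Both sides are lower-cased so the comparison is case-insensitive.
    """
    return [v for v in values
            if any(fnmatchcase(str(v).lower(), m.lower()) for m in masks)]

def numeric_search(values, mask):
    """Handle only masks of the form '>low<high' (a simplification)."""
    match = re.fullmatch(r">(\d+)<(\d+)", mask)
    if match is None:
        raise ValueError("unsupported mask: " + mask)
    low, high = int(match.group(1)), int(match.group(2))
    return [v for v in values if low < v < high]

from_2000 = wildcard_search(years, "2*")            # models {"2*"}
with_1980s = wildcard_search(years, "2*", "198*")   # models {"2*","198*"}
in_range = numeric_search(years, ">1978<2004")      # models {">1978<2004"}
```

Note that both bounds are strict in this mask, so 1978 and 2004 themselves fall outside the range.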
The selection within a field can be defined using set operators as described above, working on different element sets. E.g. the modifier <Year = {"20*", 1997} - {2000}> will select all years beginning with "20" in addition to "1997", except for "2000".
sum( {$<Product = Product + {OurProduct1} - {OurProduct2} >} Sales )
returns the sales for the current selection, but with the product "OurProduct1" added to the list of selected products and "OurProduct2" removed from the list of selected products.
sum( {$<Year = Year + ({"20*",1997} - {2000}) >} Sales )
returns the sales for the current selection but with additional selections in the field "Year": 1997 and all that begin with "20" - however, not 2000. Note that if 2000 is included in the current selection, it will still be included after the modification.
sum( {$<Year = (Year + {"20*",1997}) - {2000} >} Sales )
returns almost the same as above, but here 2000 will be excluded, also if it initially is included in the current selection. The example shows the importance of sometimes using brackets to define an order of precedence.
sum( {$<Year = {"*"} - {2000}, Product = {"*bearing*"} >} Sales )
returns the sales for the current selection but with a new selection in "Year": all years except 2000; and only for products containing the string 'bearing'.
The above notation defines new selections, disregarding the current selection in the field. However, you may want to base your selection on the current selection in the field and add field values, e.g. with the modifier <Year = Year + {2007, 2008}>. A short and equivalent way to write this is <Year += {2007, 2008}>, i.e. the assignment operator implicitly defines a union. Implicit intersections, exclusions and symmetric differences can also be defined, using "*=", "-=" and "/=".
sum( {$<Product += {OurProduct1, OurProduct2} >} Sales )
returns the sales for the current selection, but using an implicit union to add the products "OurProduct1" and "OurProduct2" to the list of selected products.
sum( {$<Year += {"20*",1997} - {2000} >} Sales )
returns the sales for the current selection but using an implicit union to add a number of years in the selection: 1997 and all that begin with "20" - however, not 2000. Note that if 2000 is included in the current selection, it will still be included after the modification. Same as <Year=Year + ({"20*",1997}-{2000})>
sum( {$<Product *= {OurProduct1} >} Sales )
returns the sales for the current selection, but only for the intersection of currently selected products and the product "OurProduct1".
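The assignment operators can be summarized as a small Python model (not QlikView syntax). The function apply_modifier, the current selection and the element sets are all invented for illustration; element_set stands in for whatever {"20*",1997} - {2000} would match in a real Year field.

```python
import operator

# Map each implicit assignment operator to the set operation it implies.
IMPLICIT = {
    "+=": operator.or_,    # union
    "-=": operator.sub,    # exclusion
    "*=": operator.and_,   # intersection
    "/=": operator.xor,    # symmetric difference
}

def apply_modifier(current, op, elements):
    """Return the new selection in one field.

    current  : values currently selected in the field
    op       : "=", "+=", "-=", "*=" or "/="
    elements : the element set on the right-hand side
    """
    if op == "=":
        return set(elements)   # plain "=" replaces the selection
    return IMPLICIT[op](set(current), set(elements))

selected_years = {2000, 2005}        # hypothetical current selection
element_set = {1997, 2001, 2002}     # assumed result of {"20*",1997} - {2000}

replaced = apply_modifier(selected_years, "=", element_set)
added = apply_modifier(selected_years, "+=", element_set)
narrowed = apply_modifier(selected_years, "*=", {2005, 2006})
```

In the += case 2000 survives because it was already selected, even though the element set itself excludes it.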
Variables and other dollar-sign expansions can be used in set expressions.
sum( {$<Year = {$(#vLastYear)}>} Sales )
returns the sales for the previous year in relation to current selection. Here, a variable vLastYear containing the relevant year is used in a dollar-sign expansion.
sum( {$<Year = {$(#=Only(Year)-1)}>} Sales )
returns the sales for the previous year in relation to current selection. Here, a dollar-sign expansion is used to calculate previous year.
Advanced searches using wildcards and aggregations can be used to define sets.
sum( {$-1<Product = {"*Internal*", "*Domestic*"}>} Sales )
returns the sales for current selection, excluding transactions pertaining to products with the string 'Internal' or 'Domestic' in the product name.
sum( {$<Customer = {"=Sum({1<Year = {2007}>} Sales ) > 1000000"}>} Sales )
returns the sales for current selection, but with a new selection in the "Customer" field: only customers who during 2007 had a total sales of more than 1000000.
In the above examples, all field values have been explicitly defined or defined through searches. There is however an additional way to define a set of field values by the use of a nested set definition.
In such cases, the element functions P() and E() must be used, representing the element set of possible values and the excluded values of a field, respectively. Inside the brackets, it is possible to specify one set expression and one field, e.g. P({1} Customer). These functions cannot be used in other expressions:
sum( {$<Customer = P({1<Product={'Shoe'}>} Customer)>} Sales )
returns the sales for current selection, but only those customers that ever have bought the product 'Shoe'. The element function P( ) here returns a list of possible customers; those that are implied by the selection 'Shoe' in the field Product.
sum( {$<Customer = P({1<Product={'Shoe'}>})>} Sales )
same as above. If the field in the element function is omitted, the function will return the possible values of the field specified in the outer assignment.
sum( {$<Customer = P({1<Product={'Shoe'}>} Supplier)>} Sales )
returns the sales for current selection, but only those customers that ever have supplied the product 'Shoe'. The element function P( ) here returns a list of possible suppliers; those that are implied by the selection 'Shoe' in the field Product. The list of suppliers is then used as a selection in the field Customer.
sum( {$<Customer = E({1<Product={'Shoe'}>})>} Sales )
returns the sales for current selection, but only those customers that never bought the product 'Shoe'. The element function E( ) here returns the list of excluded customers; those that are excluded by the selection 'Shoe' in the field Product.
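What P() and E() return can be sketched with plain sets in Python (not QlikView syntax). The transactions and both function names are hypothetical; the model covers only the case where the inner set is {1<Product={...}>} and the field is Customer.

```python
# Hypothetical transactions: (customer, product).
transactions = [
    ("Ann", "Shoe"),
    ("Ann", "Hat"),
    ("Bob", "Hat"),
    ("Cid", "Shoe"),
    ("Dee", "Sock"),
]

all_customers = {customer for customer, _ in transactions}

def possible_customers(product):
    """Model P({1<Product={product}>} Customer): customers implied by the selection."""
    return {c for c, p in transactions if p == product}

def excluded_customers(product):
    """Model E({1<Product={product}>} Customer): customers excluded by the selection."""
    return all_customers - possible_customers(product)

shoe_buyers = possible_customers("Shoe")
never_bought_shoes = excluded_customers("Shoe")
```

Ann counts as a shoe buyer even though she also bought a hat; the resulting set would then be applied as the selection in Customer.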
Hence, the full syntax (not including the optional use of standard brackets to define precedence) is
set_expression ::= { set_entity { set_operator set_entity } }
set_entity ::= set_identifier [ set_modifier ]
set_identifier ::= 1 | $ | $N | $_N | bookmark_id | bookmark_name
set_operator ::= + | - | * | /
set_modifier ::= < field_selection {, field_selection } >
field_selection ::= field_name [ = | += | -= | *= | /= ] element_set_expression
element_set_expression ::= element_set { set_operator element_set }
element_set ::= [ field_name ] | { element_list } | element_function
element_list ::= element { , element }
element_function ::= ( P | E ) ( [ set_expression ] [ field_name ] )
element ::= field_value | " search_mask "
Dear Friend,
The attached example should help you, and I hope it solves your problem.
Regards
Sunil Jain
Thanks for your response, Sunil.
I have gone through some documents and am still confused about Set Analysis. Suppose I have two list boxes and one table box.
List Boxes:
Country ,State
Table box: It has the sales data for multiple countries, with columns Country and State and a few other related columns.
I have not used Set Analysis. My selection is India as the Country and Maharashtra as the State. Will the data displayed be filtered to India and then Maharashtra, or will it behave differently?
Note: I have not used Set Analysis.
Regards,XXX
In this case the data will be filtered on India and Maharashtra, not India or Maharashtra.
Regards
Sunil Jain
Thanks Sunil,
Hello Sunil,
Can you help me with this? I am having trouble finding a way to do a set comparison where the 2 sets are created from 2 different tables in the schema.
I hope the table below is self-explanatory.
I can get exp3 by (exp2 - exp1), but I have a reason to compare the values rather than do a numeric calculation.
End result: exp3 should say whether my set {A,B,C} is bigger than set {A}.
Would highly appreciate your help.
Table1 | Table 2 | Table 3 | Graph
Field 1 | Field 1 | key | key | Field 2 | dimension | exp1 | exp2 | exp3 (true/false)
A | A | f1 | f1 | A | f1 | count(field 1) | count(field2) | ${field2} - ${field1} i.e. {A,B,C} - {A}
… | … | f1 | B | 1 | 3
… | … | f1 | C
A | f2
A | f3
I hope the complete table is visible now.
Table1 | Table 2 | Table 3
Field 1 | Field 1 | key | key | Field 2
A | A | f1 | f1 | A
… | … | f1 | B
… | … | f1 | C
A | f2
A | f3
Graph
dimension | exp1 | exp2 | exp3 (true/false)
f1 | count(field 1) | count(field2) | ${field2} - ${field1} i.e. {A,B,C} - {A}
1 | 3
@Sunil .. or anyone else,
From your example, what is the difference or advantage of:
sum({<SDate={">=$(From) <=$(To)"}>} Value)
over
SUM(IF(SDate >= From AND SDate <= To,Value))
Amien wrote: From your example, what is the difference or advantage of:
sum({<SDate={">=$(From) <=$(To)"}>} Value)
over
SUM(IF(SDate >= From AND SDate <= To,Value))
Performance. The set analysis approach is likely to perform better, particularly if the date range is small and the data set is large.
And I'll also note that your expressions are not EXACTLY the same, since the if() respects your SDate selections, while the set analysis expression overrides them. So if someone has selected SDates, you could get different results. I believe the equivalent of the IF is this:
sum({<SDate*={">=$(From) <=$(To)"}>} Value)
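To illustrate the point about selections, here is a rough Python model (not QlikView syntax) of the three variants. The rows, the day numbers used for SDate, and the user's selection are all made up; date_from and date_to stand in for the From and To variables.

```python
# Hypothetical rows: (SDate as a day number, Value).
rows = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]

date_from, date_to = 2, 4     # stand-ins for the From and To variables
user_selected = {1, 2, 3}     # SDate values the user has selected

in_range = {d for d, _ in rows if date_from <= d <= date_to}

def sum_value(dates):
    """Sum Value over the rows whose SDate is in the given set."""
    return sum(v for d, v in rows if d in dates)

# sum(if(SDate >= From and SDate <= To, Value)): evaluated inside the selection.
with_if = sum(v for d, v in rows
              if d in user_selected and date_from <= d <= date_to)

# <SDate = {...}> replaces the user's SDate selection.
with_assign = sum_value(in_range)

# <SDate *= {...}> intersects with the user's SDate selection.
with_intersect = sum_value(in_range & user_selected)
```

With this data the if() version and the *= version both give 50, while the plain = version gives 90 because it brings day 4 back in despite the user's selection.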