Re: For Qlik experts. How associations works in {1... - Qlik Community

evgeniystuchalk · ‎2020-09-28

Hi, community. Recently i've tried to make set analysis tutorial video. As simple demonstration of it's power, i wanted to show example with {1-$} on simple model:

Inline Data:

Spoiler

Sales: load * inline [
CityID,SalesSum,ProductID
1,100,1
1,150,2
1,200,3
1,170,4
1,50,5
1,180,6
2,90,1
2,40,3
2,60,5
3,80,1
];

Cities: load * inline [
CityID, City
1, Moscow
2, St.Peterburg
3, Samara
];

Products: load * inline [
ProductID, ProductName, CategoryID
1, P1,1
2, P2,1
3, P3,2
4, P4,2
5, P5,3
6, P6,3
];

Categories: load * inline [
CategoryID, CategoryName
1, C1
2, C2
3, C3
];

Sales: load * inline [CityID,SalesSum,ProductID1,100,11,150,21,200,31,170,41,50,51,180,62,90,12,40,32,60,53,80,1];Cities: load * inline [CityID, City1, Moscow2, St.Peterburg3, Samara];Products: load * inline [ProductID, ProductName, CategoryID1, P1,12, P2,13, P3,24, P4,25, P5,36, P6,3];Categories: load * inline [CategoryID, CategoryName1, C12, C23, C3];

In visual layer, i've created table that shows sum of sales by categories.

When i'm selecting city, in second measure of table i expect to see in sum of sales of excluded cities by categories.

Sum of of excluded sales calculated correct, but it not associated with categories, if selected city have sales in this categories.

Selected City with 3 categories:

Selected City with 1 category:

However, if we joined Sales, Products and Categories tables in script load, everything works as expected.

Joined Tables Code Inline:

Spoiler

Sales: load * inline [
CityID,SalesSum,ProductID
1,100,1
1,150,2
1,200,3
1,170,4
1,50,5
1,180,6
2,90,1
2,40,3
2,60,5
3,80,1
];

Products: left join load * inline [
ProductID, ProductName, CategoryID
1, P1,1
2, P2,1
3, P3,2
4, P4,2
5, P5,3
6, P6,3
];

Categories: left join load * inline [
CategoryID, CategoryName
1, C1
2, C2
3, C3
];

Cities: load * inline [
CityID, City
1, Moscow
2, St.Peterburg
3, Samara
];

Sales: load * inline [CityID,SalesSum,ProductID1,100,11,150,21,200,31,170,41,50,51,180,62,90,12,40,32,60,53,80,1];Products: left join load * inline [ProductID, ProductName, CategoryID1, P1,12, P2,13, P3,24, P4,25, P5,36, P6,3];Categories: left join load * inline [CategoryID, CategoryName1, C12, C23, C3];Cities: load * inline [CityID, City1, Moscow2, St.Peterburg3, Samara];

while i'm writing this, some conclusions already generated 🙂

1) Set Analysis works on level of data model tables, not on hypercube level. It means, in first case, we excluding all DATA MODEL records, that related to current selections, that means 3 out of 3 categories records, from Category Dictionary. And for {1-$} scenario, no categories left to display for this measure.

2) In second case, we also excluding records with 3 categories that related to selected city. But! Excluded sales records have their own category names, and it means we can see sum for excluded cities by category.

3) For analytics of excluded scenarios, more reliable use e() and p() functions.

Also, it will be nice to hear some commentaries and advices from experts 🙂

upd. 4) sum({1} Sales) - sum({$} Sales) also work with Categories in separate table, and allow to avoid missing data, if some of values in selected fields is null.

@hic

@Oleg_Troyansky

marchoctober · ‎2020-09-28

@stevedark @barryharmsen Hey Mates, what would you think regards Eugene's finding? Thanks you!

Oleg_Troyansky · ‎2020-09-28

This is a very interesting observation! I've never thought of it this way, mostly because I never used {1-$}, other than giving it as an example of using Set Analysis Operators with Set Identifiers in my training classes.

However, once you see this behavior, it's not too hard to explain why does it happen.

In the normalized data structure, the set of data that is associated with the Current Selections ($), includes all the associated Fact records, and all other Dimensional records that are associated with the current selection. So, when a single Store is selected, all associated Products and Categories are being included in the Set $. When we exclude the Set $ (current selections) from the Set 1 (all data), all these associated products and categories get excluded from the resulting Set. Hence, it looks like the Set Analysis "doesn't work".

In the de-normalized structure, when all Dimensions are joined into the Fact, each row of the Fact carries its own copy of the dimensional attributes, such as Product and Category. So, now the {1-$} produces the expected results, because all the transactions in the Fact table that survived the exclusion, have their own instances of Products and Categories that are stored in the same table. In this case, it looks like Set Analysis "worked correctly".

In reality, Set Analysis worked the same way in both cases, however the difference in the data structure caused completely different results.

And of course, I completely agree with your assessment that it's always better to use functions P() and E() to exclude particular selected Dimensional values from a calculation.

Using the opportunity, let me invite all interested developers to the on-line version of our Masters Summit for Qlik 2020. We called it "Masters Summit @ Home". I'm opening on November 18th with my popular session on Set Analysis and Advanced Aggregation. Visit MastersSummit.com for details.

Oleg Troyansky
Call me for expert support, performance tuning, advanced training and more!

For Qlik experts. How associations works in {1-$} set analysis?

Advanced Authoring

Associative

Set Analysis