Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Bucharest on Sept 18th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
pjonathon
Contributor
Contributor

Set Intersection

Having trouble understanding and implementing set intersection. I think I am following the correct syntax but the outcome is different to what I expect.

What I am trying to implement is the following: (A or B) AND (C or D)

I only want to count items that were sold in Region A or B and region C or D

Here is my load script:

Sales:
Load * inline

[

Item,Region
Item1,A
Item2,B
Item3,C
Item4,D
Item1,D
Item2,C
];

Here is my formula: count(distinct{<Region={'A','B'},Region={'C','D'}>} Item)

I expect to get a count of 2, since Item1 was sold in region A and D and Item2 was sold in region B and C

Does anyone have an idea on what is wrong with my formula?

 

Labels (1)
1 Solution

Accepted Solutions
marcus_sommer

I think this kind of intersection isn't solvable with a single aggregation because Region could have only one value on a record-level and therefore each AND condition must fail. But within a chart with Item as dimension you could use something like this:

count({< Region = {'A', 'B'}>} Region) * count({< Region = {'C', 'D'}>} Region)

and should it be counted within a KPI it might be wrapped with an aggr() like:

sum(aggr(count({< Region = {'A', 'B'}>} Region) * count({< Region = {'C', 'D'}>} Region), Item))

- Marcus

View solution in original post

7 Replies
Anil_Babu_Samineni

I know your intend, but you can simplify like below if output expected same

Count({<item={"= count(item)>1">} item)

Or, your way may be

Count({<region={'A', 'B'}>+<region={'C', 'D'>} item)

Or, this also optimised if you have many records

Count(if((region='A' or region='B') and (region='C' or region='D'), item))

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
rogerpegler
Creator II
Creator II

Try:

count(distinct{<Region={'A','B'}>*<Region={'C','D'}>} Item)

The '*' gives you an intersection between each set defined by <..>. 

pjonathon
Contributor
Contributor
Author

The first formula count(distinct{<Region={'A','B'}>+<Region={'C','D'}>} Item) results in 6. Which is not what I am looking for.  Since "+" represents Union I see why I get 6. I am looking for Intersection.

Count(if( (Region='A' or Region='B') and (Region='C' or Region='D'), Item)) results in 0

 

pjonathon
Contributor
Contributor
Author

count(distinct{<Region={'A','B'}>*<Region={'C','D'}>} Item) results in 0

Anil_Babu_Samineni

Can you please post reality data. I didn't understand the logic from sample data

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
marcus_sommer

I think this kind of intersection isn't solvable with a single aggregation because Region could have only one value on a record-level and therefore each AND condition must fail. But within a chart with Item as dimension you could use something like this:

count({< Region = {'A', 'B'}>} Region) * count({< Region = {'C', 'D'}>} Region)

and should it be counted within a KPI it might be wrapped with an aggr() like:

sum(aggr(count({< Region = {'A', 'B'}>} Region) * count({< Region = {'C', 'D'}>} Region), Item))

- Marcus

pjonathon
Contributor
Contributor
Author

This worked. I am getting the expected results. Explanation was right on - reading the aggr function, it makes sense.