Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi there,
I've been scratching my head over the following task. Trying to calculate a count and sum for unique items in groups. Here is a sample table:
# | ItemGroup | ItemPrice | ItemId | OtherColumns... |
1 | ABC | 100.00 | 123 | |
2 | DEF | 100.00 | 123 | |
3 | ABC | 150.00 | 125 | |
4 | ABC | 200.00 | 127 | |
5 | DEF | 150.00 | 131 | |
6 | DEF | 150.00 | 133 | |
7 | GHI | 250.00 | 142 | |
8 | GHI | 150.00 | 131 |
Now, imagine that the list is filtered by ItemId={123,127} so that it only includes items # 1, 2, and 4. Also note that one item may belong to multiple groups, but no items are duplicated within the same group. The number of groups is large enough so I can't just enumerate them within a set expression.
I need two formulas that would allow me to get two metrics (don't have to show them in the table):
a) Number of all unique items in the table that are present in at least one group that is left in the current selection
b) Sum of prices of all unique items in the table that are present in at least one group that is left in the current selection
For my example, the solutions are:
a) 3 unique items in group ABC + 2 unique items in group DEF + 0 from group GHI as it is not present in the current selection = 5
b) (100 + 150 + 200 from group ABC) + (150 + 150 from group DEF) + (0 from group GHI) = 850
To make that kind of calculation, I will need to go beyond the applied ItemId filter to get all the items from the groups that are left in the current selection. In addition, I will need to make sure each unique item and its price is only counted once.
I would appreciate any directions as it seems I'm out of options. Thank you!
I found the second problem (price) challenging. Your solution works for the 123,127 example, but what about 131? That produces incorrect (I think) results from your measures for 131. The measures I came up with are:
=count({<ItemGroup=p({<ItemGroup>}), ItemId=>}DISTINCT ItemId)
=sum({<ItemGroup=p({<ItemGroup>}), ItemId=>}aggr(avg({<ItemGroup=p({<ItemGroup>}), ItemId=>}ItemPrice), ItemId))
Attached is a sample app showing both your and my calculations.
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
Hey, I may have figured it out.
Here are my formulas:
a) =Sum(Aggr(Count({$<[ItemId]=>}Aggr(Only({$<[ItemId]=>}ItemId),ItemId)),ItemGroup))
b) =Sum(Aggr(Sum({$<[ItemId]=>}Aggr(Only({$<[ItemId]=>}ItemPrice),ItemId)),ItemGroup))
The nested Aggr makes a list of all unique items, then the second Aggr comines the counts/soum into groups, which are being summed by the last function.
I've run some tests, and it seems to be working fine. Feel free to reuse for your own good. Happy data analyzing!
Let me know if you spot any issues or have any questions to discuss further.
I found the second problem (price) challenging. Your solution works for the 123,127 example, but what about 131? That produces incorrect (I think) results from your measures for 131. The measures I came up with are:
=count({<ItemGroup=p({<ItemGroup>}), ItemId=>}DISTINCT ItemId)
=sum({<ItemGroup=p({<ItemGroup>}), ItemId=>}aggr(avg({<ItemGroup=p({<ItemGroup>}), ItemId=>}ItemPrice), ItemId))
Attached is a sample app showing both your and my calculations.
-Rob
http://masterssummit.com
http://qlikviewcookbook.com
http://www.easyqlik.com
Nicely done, Rob!
Your solution is certainly superior to mine, and upon taking another look at my nested Aggrs, I think I can see the flaw in my approach - when making the initial list of unique values, it does not take into account which groups are in the current selection, and so the solution will only work in certain cases.
It's great to see the power of p() function in action - duly noted.
One thing that I have hard time to understand is the use of Avg inside the Aggr for calculating the sum of uniques. Would you mind elaborating a bit on this move so I can understand 100% of your approach?
Avg() was a way of extracting (and passing to the sum) a single price for each ItemId. I probably could have used only() since price is the same for each occurance of ItemId. Using Avg() does protect though against the case where there may be different prices for the same ItemId.
-Rob