Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello Folks,
I'm working with a table with one dimension, and one measure. My measure is of the form SUM({<[PayType]={'HR'}>} PayQuantity)
What I would like, is that even when there is no [PayType]={'HR'} row, the sum still comes out to zero. Below is an example of what I'd like.
Desired Output:
Sometimes people will suggest add the term + SUM({1} 0) to the Set Analysis. The issue with that is that selections made by the user are ignored as that '1' means the entire dataset.
What I've tried:
SUM({<[PayType]={'HR'}>} PayQuantity) <-- The starting point
SUM({<[PayType]={'HR'}>} PayQuantity) + SUM({1} 0) <--Brings back all rows regardless of user selection
Alt(SUM({<[PayType]={'HR'}>} PayQuantity) ,0)
SUM({<[PayType]={'HR'}>} Alt(PayQuantity ,0))
RangeSum(SUM({<[PayType]={'HR'}>} PayQuantity),0) <-- Lol
Here is the dummy data that represents the problem I'm trying to solve.
Dummy Data:
Orders:
LOAD [Order Number] AS '%order_number',
*
;
LOAD * Inline
[
'Order Number'
1000,
2000,
3000
]
;
PayDetails:
LOAD * Inline
[
'%order_number','PayType', 'PayQuantity'
1000, 'HR', 10
2000, 'HR', 15
3000, 'TP', 13.5
]
;
Any help is greatly appreciated.
There has to be a cleaner approach than this, but it *should* work:
SUM({<[PayType]={'HR'}>} PayQuantity) + sum(0)
No need for set analysis on that last one. At least based on your sample data, it handles fine with just sum(0).
There has to be a cleaner approach than this, but it *should* work:
SUM({<[PayType]={'HR'}>} PayQuantity) + sum(0)
No need for set analysis on that last one. At least based on your sample data, it handles fine with just sum(0).
how about below? Your first expression is filtering out all PayTypes to only HR, hence you are excluding it and ) can't be shown... second part of my expression brings 0 for all Paytypes but it also respects PayType selections (note "*=")
SUM({<[PayType]={'HR'}>} PayQuantity)
+
SUM({<[PayType]*={'*'}>} 0)
Thanks, that did the trick. I'm kind of upset at how easy the solution was.
Haha, I also don't know why i got so fixated to use set analysis... silly me 🙂
Hi there!
Here you go - find a fully worked example attached of two solutions to your problem statement and explanatory notes below.
Proposed solution 1: sum(if(PayType='HR',PayQuantity))
Sometimes the easiest solutions are the most effective.
Consider use of a simple 'IF' statement rather than falling back on Set Analysis syntax to solve the problem.
With Set Analysis - scoping the query PayType='HR' categorically eliminates access to anything not associated with HR. The aggregation (the sum) is thus computed across the participant dimensions and referenced field inside the aggregation for the Selection State defined by the Set Analysis statement. I.e:
OrderNumber,PayType,PayQuantity
1000,HR,10
2000,HR,15
The IF statement on the other hand is subject to your current selections in full as no Set Analysis statement is applied to modify the Selections (the Set).
Hence it filters the data table:
OrderNumber,PayType,PayQuantity
1000,HR,10
2000,HR,15
3000,TP,13.5
and produces row-by row the following unaggregated results
OrderNumber,PayType,PayQuantity
1000,HR,10
2000,HR,15
3000,TP,<NULL>
thus when summed up delivering
OrderNumber,sum(if(PayType='HR',PayQuantity))
1000,10
2000,15
3000,0 <-- because summing null values is a genuine operation and different from not having a value to sum at all.
Proposed solution 2: SUM({<[PayType]={'HR'}>} PayQuantity)+Count(DISTINCT [Order Number])*0
The IF statement solution isn't viable in all scenarios - you generally want to avoid traversing tables with an IF statement as you could suffer from double counting if you happen to traverse a many-to-many relationship -> Cartesian style join effects (if that means nothing to you - just remember: stick to only using IF statements if the participant fields inside the 'IF' all come from the SAME in-memory table).
So for completeness, solution 2 is a highly optimized way that should serve your needs as stated - ensuring that a zero value is displayed for all Order Number records, whilst staying true to the reduced scope of any selections made by users. Count(DISTINCT [Order Number]) is highly optimized because:
It counts the distinct value array of 'Order Number' (referred to in Qlik speak as the 'Symbol Table') - so even if you have 1 billion fact records stamped with Order Numbers, if you only have 1,000 order numbers, that's the maximum participant values in the Count(DISTINCT [Order Number]) operation - not 1 billion.
Alright, so this was probably way more info than what you were after - but I hope you and/or other community members will find the answer enlightening.
Enjoy!
-Jonas