Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
If we get a surge of Covid patients, and we need to activate our Overflow Emergency Department, we want to be able to know how many people are in that Department as soon as patients start being assigned to it. So I wrote a set analysis formula to count them:
=count({<Department={'EMERGENCY DEPARTMENT OVERFLOW'}>}Distinct PatientID)
Nobody is currently assigned to that Department. Yet instead of returning zero or null, it returns a count of ALL Distinct PatientIDs in the .qvw, even though none of them are associated with Department = 'EMERGENCY DEPARTMENT OVERFLOW'.
1) Why does it do this?
2) How can I work around this problem so that as soon as 1 patient is assigned that Department, the calculation returns an accurate value?
The cause is that there are no values available for the field. If this happens the set analysis condition is just ignored and therefore the expression returned the results for the whole dataset. I agree that's from a user point of view not the expected behaviour. I assume that there are technically reasons because the set analysis logic based on the system- and state-tables and the normally returning TRUE or FALSE couldn't be assigned to NULL respectively to NOTHING. A technically solution for it might be to implement a further loop which checked each set analysis statement - but the performance impact would be probably quite huge. Therefore I believe it is like it is ...
An alternatively might be to ensure that there is always a field-value maybe with something like this:
concatenate(YourTable) load ' ' as YourField autogenerate 1;
Probably there are more elegant ways to implement such things and of course it has some side-effects especially if it's applied to a larger amount of fields.
Beside this you could try to bypass such cases by applying an indirect set analysis, maybe with something like this:
count({<PatientID = {"=OverdueForSecondDose='Yes'"}>}DISTINCT PatientID)
but it's not really better as your if-approach because it's mostly the same just the syntax is different.
In both above mentioned workarounds you need to know which fields have any values or not to apply any of these logics.
A more robust alternatively could be to adjust the datamodel to switch from a more or less crosstable-structure to a stream-data structure, means changing something from:
ID First Second Third
1 yes no yes
to something like:
ID Category Value
1 First yes
1 Second no
1 Third yes
and the risk to run in such an issue should be massively reduced.
- Marcus
can you share a sample data and the expected output ?
If this seems like anomolous behavior and going to the trouble of prepping a qvw with sharable (non-confidential) sample data is necessary, I will. But it seems like this may be Working as Designed, in which case I would like to understand it better and how others handle this kind of situation. So I'm interested in input on that.
About the best I can offer is the following Design Blog post that may provide some further information regarding Set Analysis:
https://community.qlik.com/t5/Qlik-Design-Blog/A-Primer-on-Set-Analysis/ba-p/1468344
There are some further links to other posts at the bottom, just FYI.
If you want to search for other posts, use the following link:
https://community.qlik.com/t5/Qlik-Design-Blog/bg-p/qlik-design-blog
Hopefully that may get you what you need to explain things, the only other thing would be to review the Help documentation too:
Regards,
Brett
Can anyone speak to this kind of situation? It isn't an anomoly. It's happening again when I try to count how many patients are overdue for their second dose of Covid-19 vaccine (and other situations). Nobody is overdue yet. But this returns a count of EVERYONE:
=count({<OverdueForSecondDose={'Yes'}>}DISTINCT PatientID)
the workaround is to check if there is a value in that field at all, and not count it if there isn't:
if(LEN(OverdueForSecondDose) > 0
,count({<OverdueForSecondDose={'Yes'}>}DISTINCT PatientID)
,0
)
But I'd like to understand why it is doing that and hopefully learn how to calculate without having to check for the length of the value in OverdueForSecondDose.
Thank you.
I blindly trust on your model that may have some field associated with this otherwise data can be static rather data points? can you check that and confirm?
The cause is that there are no values available for the field. If this happens the set analysis condition is just ignored and therefore the expression returned the results for the whole dataset. I agree that's from a user point of view not the expected behaviour. I assume that there are technically reasons because the set analysis logic based on the system- and state-tables and the normally returning TRUE or FALSE couldn't be assigned to NULL respectively to NOTHING. A technically solution for it might be to implement a further loop which checked each set analysis statement - but the performance impact would be probably quite huge. Therefore I believe it is like it is ...
An alternatively might be to ensure that there is always a field-value maybe with something like this:
concatenate(YourTable) load ' ' as YourField autogenerate 1;
Probably there are more elegant ways to implement such things and of course it has some side-effects especially if it's applied to a larger amount of fields.
Beside this you could try to bypass such cases by applying an indirect set analysis, maybe with something like this:
count({<PatientID = {"=OverdueForSecondDose='Yes'"}>}DISTINCT PatientID)
but it's not really better as your if-approach because it's mostly the same just the syntax is different.
In both above mentioned workarounds you need to know which fields have any values or not to apply any of these logics.
A more robust alternatively could be to adjust the datamodel to switch from a more or less crosstable-structure to a stream-data structure, means changing something from:
ID First Second Third
1 yes no yes
to something like:
ID Category Value
1 First yes
1 Second no
1 Third yes
and the risk to run in such an issue should be massively reduced.
- Marcus
Thank you, Marcus! That's super helpful.
I am also thinking about changing the SQL that creates the dataset to populate OverdueForSecondDose with "No" when it is not "Yes". PatientIDs that don't exist in that dataset would still have null for that column but I think I need only one value for the simpler SetAnalysis to work properly.