Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Margaret
Creator II
Creator II

set analysis should return null or zero, instead returns count of all

If we get a surge of Covid patients, and we need to activate our Overflow Emergency Department, we want to be able to know how many people are in that Department as soon as patients start being assigned to it. So I wrote a set analysis formula to count them:

=count({<Department={'EMERGENCY DEPARTMENT OVERFLOW'}>}Distinct PatientID)

Nobody is currently assigned to that Department. Yet instead of returning zero or null, it returns a count of ALL Distinct PatientIDs in the .qvw, even though none of them are associated with Department = 'EMERGENCY DEPARTMENT OVERFLOW'.

1) Why does it do this?
2) How can I work around this problem so that as soon as 1 patient is assigned that Department, the calculation returns an accurate value?

1 Solution

Accepted Solutions
marcus_sommer

The cause is that there are no values available for the field. If this happens the set analysis condition is just ignored and therefore the expression returned the results for the whole dataset. I agree that's from a user point of view not the expected behaviour. I assume that there are technically reasons because the set analysis logic based on the system- and state-tables and the normally returning TRUE or FALSE couldn't be assigned to NULL respectively to NOTHING. A technically solution for it might be to implement a further loop which checked each set analysis statement - but the performance impact would be probably quite huge. Therefore I believe it is like it is ...

An alternatively might be to ensure that there is always a field-value maybe with something like this:

concatenate(YourTable) load ' ' as YourField autogenerate 1;

Probably there are more elegant ways to implement such things and of course it has some side-effects especially if it's applied to a larger amount of fields.

 Beside this you could try to bypass such cases by applying an indirect set analysis, maybe with something like this:

count({<PatientID = {"=OverdueForSecondDose='Yes'"}>}DISTINCT PatientID)

but it's not really better as your if-approach because it's mostly the same just the syntax is different.

In both above mentioned workarounds you need to know which fields have any values or not to apply any of these logics.

A more robust alternatively could be to adjust the datamodel to switch from a more or less crosstable-structure to a stream-data structure, means changing something from:

ID   First   Second   Third
1     yes     no              yes

to something like:

ID   Category   Value
1     First             yes
1     Second      no
1     Third           yes 

and the risk to run in such an issue should be massively reduced.

- Marcus

View solution in original post

7 Replies
Taoufiq_Zarra

can you share a sample data and the expected output ?

Regards,
Taoufiq ZARRA

"Please LIKE posts and "Accept as Solution" if the provided solution is helpful "

(you can mark up to 3 "solutions") 😉
Margaret
Creator II
Creator II
Author

If this seems like anomolous behavior and going to the trouble of prepping a qvw with sharable (non-confidential) sample data is necessary, I will. But it seems like this may be Working as Designed, in which case I would like to understand it better and how others handle this kind of situation. So I'm interested in input on that.

 

Brett_Bleess
Former Employee
Former Employee

About the best I can offer is the following Design Blog post that may provide some further information regarding Set Analysis:

https://community.qlik.com/t5/Qlik-Design-Blog/A-Primer-on-Set-Analysis/ba-p/1468344

There are some further links to other posts at the bottom, just FYI.

If you want to search for other posts, use the following link:

https://community.qlik.com/t5/Qlik-Design-Blog/bg-p/qlik-design-blog

Hopefully that may get you what you need to explain things, the only other thing would be to review the Help documentation too:

https://help.qlik.com/en-US/qlikview/April2020/Subsystems/Client/Content/QV_QlikView/ChartFunctions/...

Regards,
Brett

To help users find verified answers, please do not forget to use the "Accept as Solution" button on any post(s) that helped you resolve your problem or question.
I now work a compressed schedule, Tuesday, Wednesday and Thursday, so those will be the days I will reply to any follow-up posts.
Margaret
Creator II
Creator II
Author

Can anyone speak to this kind of situation? It isn't an anomoly. It's happening again when I try to count how many patients are overdue for their second dose of Covid-19 vaccine (and other situations). Nobody is overdue yet. But this returns a count of EVERYONE:

=count({<OverdueForSecondDose={'Yes'}>}DISTINCT PatientID)

the workaround is to check if there is a value in that field at all, and not count it if there isn't:

if(LEN(OverdueForSecondDose) > 0
,count({<OverdueForSecondDose={'Yes'}>}DISTINCT PatientID)
,0
)

But I'd like to understand why it is doing that and hopefully learn how to calculate without having to check for the length of the value in OverdueForSecondDose. 

Thank you.

Anil_Babu_Samineni

I blindly trust on your model that may have some field associated with this otherwise data can be static rather data points? can you check that and confirm?

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions". Please LIKE threads if the provided solution is helpful
marcus_sommer

The cause is that there are no values available for the field. If this happens the set analysis condition is just ignored and therefore the expression returned the results for the whole dataset. I agree that's from a user point of view not the expected behaviour. I assume that there are technically reasons because the set analysis logic based on the system- and state-tables and the normally returning TRUE or FALSE couldn't be assigned to NULL respectively to NOTHING. A technically solution for it might be to implement a further loop which checked each set analysis statement - but the performance impact would be probably quite huge. Therefore I believe it is like it is ...

An alternatively might be to ensure that there is always a field-value maybe with something like this:

concatenate(YourTable) load ' ' as YourField autogenerate 1;

Probably there are more elegant ways to implement such things and of course it has some side-effects especially if it's applied to a larger amount of fields.

 Beside this you could try to bypass such cases by applying an indirect set analysis, maybe with something like this:

count({<PatientID = {"=OverdueForSecondDose='Yes'"}>}DISTINCT PatientID)

but it's not really better as your if-approach because it's mostly the same just the syntax is different.

In both above mentioned workarounds you need to know which fields have any values or not to apply any of these logics.

A more robust alternatively could be to adjust the datamodel to switch from a more or less crosstable-structure to a stream-data structure, means changing something from:

ID   First   Second   Third
1     yes     no              yes

to something like:

ID   Category   Value
1     First             yes
1     Second      no
1     Third           yes 

and the risk to run in such an issue should be massively reduced.

- Marcus

Margaret
Creator II
Creator II
Author

Thank you, Marcus! That's super helpful.

I am also thinking about changing the SQL that creates the dataset to populate OverdueForSecondDose with "No" when it is not "Yes". PatientIDs that don't exist in that dataset would still have null for that column but I think I need only one value for the simpler SetAnalysis to work properly.