Solved: Re: Help with set analysis - Page 3 - Qlik Community

Report Inappropriate Content · ‎2015-05-15

Hi all,

I have a data set as shown

Participant	Question	Answer
A	1	Yes
A	2	No
B	1	Yes
B	2	Yes
C	1	N/A
C	2	N/A
D	1	Yes
D	2	N/A

A participant get to answer either yes, no or N/a for Questions 1 and 2.

My requirement is to find the percentage of responses who answered yes to question 1 and question 2 among total responses. N/A should not be considered.

Answer = ( count of responses who answered yes to question 1 and question 2) / ( count of responses to question 1 which was yes)

In above example , 3 participants answered yes to question 1 and there was one yes and one no to question 2 ( don't count N/A)

therefore- answer = 50% or (1/2)

Numerator = count({ <Question={1} , Answer={Yes}, Question={2},Answer={Yes}> } Participant) <-- not working

Denominator= count({<Question={1},Answer={'Yes'},Question={2} ,match(Answer,'Yes','No') > } Participant) -< not working

Any help on how I need to form set analysis expression for Numerator and denomitator?

Report Inappropriate Content · ‎2015-05-17

Hi Massimo. sunny,

Thank you once again for your solution and patience. I am almost there with the answer. The expression you provided works but in my real example, the question numbers are 1179 and 1168 instead of 1 and 2. Can you please view the attached file and see where I need to change the expression?

In the file attached, the answers should be same for both the data sets but it doesn't work for the second one.

sunindia‌

!

sunny_talwar · ‎2015-05-18

The two databases match up now with this addition:

Sum({<Question1 = {'1', '2'}>}aggr(if(match(concat({<Question1 = {'1','2'}>}Answer1, '', Question1),'YesNo','YesYes', 'NoYes'),1), Participant1))

Sum({<Question = {'1179', '1168'}>}aggr(if(match(concat({<Question = {'1179', '1168'}>}Answer, '', Question),'YesNo','YesYes', 'NoYes'),1), Participant))

Now the question is, that is it giving the answer you are expecting or not? PFA the application.

Best,

Sunny

Report Inappropriate Content · ‎2015-05-18

Hi Sunny,

This fix is also not working. Infact it should be 100% for both case , not 64%. ( 7 yes for 1179, and all those 7 said yes for 1168 so 7/7=100%)

The example I shared gave 100% for first data set but 64% for second data set. Now both are at 64%.

Can you please briefly explain how you are trying to achieve this so that I can work it out myself?

sunindia‌

sunny_talwar · ‎2015-05-18

What is the output you are expecting to see for %? 50% or 100% and what is the logic?

Best,

Sunny

Report Inappropriate Content · ‎2015-05-18

It should be 100% because of the 6 that answered yes to 1179, all six said yes to 1168 as well. so its 6/6 =100%

Also, of the six who said yes to 1179, if 5 said yes to 1168 and one said N/A it should still be 100% 9 (we only look for yes and no's)

If 6 said yes to 1179 but 5 said yes and one said no to 1168 then it should be 5/6

sunindia‌

sunny_talwar · ‎2015-05-18

From what I understand, this is what you are looking for:

I made a minor change in the script so that we have Yes for 1179 and No for 1168.

LOAD * INLINE [

Participant, Question, Answer

P1, 1179, Yes

P1, 1168, No

P2, 1179, Yes

P2, 1168, Yes

P3, 1179, Yes

P3, 1168, Yes

P4, 1179, Yes

P4, 1168, Yes

P5, 1179, No

P5, 1168, Yes

P6, 1179, Yes

P6, 1168, Yes

P7, 1179, No

P7, 1168, Yes

P8, 1179, No

P8, 1168, Yes

P9, 1179, No

P9, 1168, Yes

P10, 1179, N/A

P10, 1168, Yes

P11, 1179, Yes

P11, 1168, Yes

P12, 1179, N/A

P12, 1168, No

P13, 1179, Yes

P13, 1168, Yes

P14, 1179, N/A

P14, 1168, No

];

In this data we have 6 Yes for 1179 which have Yeses for 1168 (in brown) and 7 yeses in total.

HTH

Best,

Sunny

Report Inappropriate Content · ‎2015-05-18

sunindia‌

Thank you once again. I am going to use this answer as it is as close as it gets. It still is not consistent. Say if i keep the data same and change just the question numbers from 1179 to 1181 and 1168 to 1182, the answer changes ( see attached) . I am assuming its got to do with magnitude of question number, because the earlier expression works then. I sincerely appreciate you taking your time out and helping me : ) I am going to leave this thread open.

sunny_talwar · ‎2015-05-18

So the issue is that you need to decide if the independent question is bigger or small question. So for example sticking with your example 1179 and 1168. 1179 is independent and 1168 is kind of dependent as in that only where we have yeses for 1179 we will check 1168. In this case 1179 > 1168 we can use this formula

Sum({<Question = {'1179', '1168'}>}aggr(if(match(concat({<Question = {'1179', '1168'}>}Answer, '', -Question),'YesNo','YesYes'),1), Participant))

Alternatively we can use this also

Sum({<Question = {'1179', '1168'}>}aggr(if(match(concat({<Question = {'1179', '1168'}>}Answer, '', Question),'NoYes','YesYes'),1), Participant))

But when the independent number is smaller the formula will change (so for example 1168 < 1179)

Sum({<Question = {'1179', '1168'}>}aggr(if(match(concat({<Question = {'1179', '1168'}>}Answer, '', Question),'YesNo','YesYes'),1), Participant))

Alternatively we can also use

Sum({<Question = {'1179', '1168'}>}aggr(if(match(concat({<Question = {'1179', '1168'}>}Answer, '', -Question),'NoYes','YesYes'),1), Participant))

The issue is that the way the concat function is working here is driving the results for you. If you know the independent question is bigger or smaller this approach will work, if not then I think you are better off using the script approach where you can flag is accordingly without worrying about the concat order.

HTH

Best,

Sunny

Report Inappropriate Content · ‎2015-05-19

Thanks Sunny, that was indeed helpful!

sunny_talwar · ‎2015-05-19

Awsome . I am glad I was able to help.

Best,

Sunny