Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Standard deviation and weighted average

Hi,

I am analysing aggregated survey results using Qlikview. The results  of the survey are on a scale 1-5, so for example I have:

Score 1 - 12 respondents

Score 2 - 23 respondents

Score 3 - 45 respondents

Score 4 - 120 respondents

Score 5 - 35 respondents

Each rating score is a separate dimension in the data source and is aggregated data.

I am looking to calculate two statistics:

1) the weighted average score

2) the standard deviation of the mean

I know that to calculate the weighted average in excel it would be (1 x 12) + (2 x 23) + (3 x 45) +(4 x 120) + (5 x 35) /   235  (number of respondents) = 3.6

From this I need to calculate the standard deviation of the mean.

But I am having trouble expressing this in qlikview.

Any help would be appreciated.

1 Reply
johnw
Champion III
Champion III

The simplest approach might be to unaggregate the scores again.  For instance:

Raw:
LOAD Score as Score2
RESIDENT Aggregated
WHILE iterno()<=Respondents
;

Then you can just use avg(Score2) and stdev(Score2).  Even if you have a huge sample size, it's only a single small integer getting duplicated (Edit: well, plus a key field to connect it to your data), so the memory requirements might not be bad.  If that is impractical for you, here's a way to calculated the weighted average and the standard deviation without unaggregating:

weighted average   = sum(Score*Respondents)/sum(Respondents)
standard deviation = sqrt(sum(Respondents*sqr(Score-sum(total Score*Respondents)/sum(total Respondents)))/(sum(Respondents)-1))

You may or may not want the final -1 in the standard deviation.  It's required to duplicate what QlikView does, and Wikipedia indicates that this is used when you have a sample from a population.  If you have ALL responses, then it may be appropriate to remove it.


See attached.