Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Inspired by swuehl's response here: Help with Standard deviation Full population No... | Qlik Community I thought it make sense to share a way to calculate population standard deviation as QlikView's Stdev() seems to be a sample standard deviation. The difference between the two becomes negligible as the number of data point increases, but there might be a use case to apply population standard deviation for a smaller dataset.
I start with no dimension in an Excel file
If you look at the statestic box, you will see that the standard deviation will show the sample stdev number
Also, when you use =Stdev(Data1), it will show the same number as above. In order to get population stdev, all you need to do is to multiple the standard deviation with this
=Stdev(Data1) * sqrt((Count(Data1)-1)/Count(Data1))
In other words, I am multiplying the Sample standard deviation with Sqrt((Number of Observations - 1)/Number of observations). So, for the above sample it would be Sqrt(9/10).
And the same logic can be carry forward to a table to create confidence intervals using TOTAL qualifer
Sample
=Stdev(TOTAL Data1)
Population
=Stdev(TOTAL Data1) * sqrt((Count(TOTAL Data1)-1)/Count(TOTAL Data1))
Same idea can be applied if you want to calculate Sample or Population Standard Deviation by Dimension
In the a chart, you can use these expressions
Sample
=Stdev(TOTAL <Region> Data3)
Population
=Stdev(TOTAL <Region> Data3) * sqrt((Count(TOTAL <Region> Data3)-1)/Count(TOTAL <Region> Data3))
Great post, thank you @sunny_talwar !
This is such a great post @sunny_talwar . How would I adjust the standard deviation to work of a weighted average, rather than the sample average?
Hello all.
I am new to QLIK. I need this same formula, but it is not working for me for some reason. Should I replace the > with a comma?
=Stdev(TOTAL <Region> Data3) * sqrt((Count(TOTAL <Region> Data3)-1)/Count(TOTAL <Region> Data3))
So it will be
=Stdev(TOTAL Region, Data3) * sqrt((Count(TOTAL Region, Data3)-1)/Count(TOTAL Region, Data3))
Ok. I figure it out. But I am getting an extremely high number.