Standard Deviation (Population vs. Sample)

    Inspired by swuehl's response here: Help with Standard deviation Full population No... | Qlik Community I thought it make sense to share a way to calculate population standard deviation as QlikView's Stdev() seems to be a sample standard deviation. The difference between the two becomes negligible as the number of data point increases, but there might be a use case to apply population standard deviation for a smaller dataset.

     

    I start with no dimension in an Excel file

    Capture.PNG

     

    If you look at the statestic box, you will see that the standard deviation will show the sample stdev number

    Capture.PNG

     

    Also, when you use =Stdev(Data1), it will show the same number as above. In order to get population stdev, all you need to do is to multiple the standard deviation with this

     

    =Stdev(Data1) * sqrt((Count(Data1)-1)/Count(Data1))

     

    In other words, I am multiplying the Sample standard deviation with Sqrt((Number of Observations - 1)/Number of observations). So, for the above sample it would be Sqrt(9/10).

     

    Capture.PNG

     

    And the same logic can be carry forward to a table to create confidence intervals using TOTAL qualifer

    Sample

    =Stdev(TOTAL Data1)

     

    Population

    =Stdev(TOTAL Data1) * sqrt((Count(TOTAL Data1)-1)/Count(TOTAL Data1))

    Capture.PNG

     

    Same idea can be applied if you want to calculate Sample or Population Standard Deviation by Dimension

     

    Capture.PNG

     

    In the a chart, you can use these expressions

     

    Sample

    =Stdev(TOTAL <Region> Data3)

     

    Population

    =Stdev(TOTAL <Region> Data3) * sqrt((Count(TOTAL <Region> Data3)-1)/Count(TOTAL <Region> Data3))


    Capture.PNG