
Normal probability plot
Erica Whalley Dec 11, 2011 11:18 AM (in response to Aldo Liaks)Hi Aldo, by "normal probability plot" I guess you mean one of these?
These are used to determine how well a dataset fits the normal distribution, important before you decide whether to base other statistical indicators on it.
The idea here is that for each data point, you plot the actual Z value (the number of standard deviations above/ below the mean), and then the Z value you would expect from that point in the cumulative distribution. The x axis are "simulated" Z values and the y axis are the actual Z values for that dataset.
If the dataset is fantastically normal, you will find that they fit a straight line diagonally along the centre of the plot, IE where X=Y. The way that the plotted line skews from the centre can also tell you more about the distribution. See http://www.skymark.com/resources/tools/normal_test_plot.asp for more information.
These charts are great for an "at a glance" view of telling whether the data fits or not.
So how to plot one? Start with a new chart, set the type to Scatter plot and select your dimension.
We then need to calculate 2 expressions, the "actual" Z score and "estimated" Z score.
Actual Z score:
this is the (value  average) / standard deviation. You can work this out with functions in Qlikview.
EG if your data point is quantity, your expression would look like this:
=(quantity average(total quantity)) / stdev(total quantity)
Estimated Z score
This is the Z score that the point should correspond to, based on it's cumulative percentile within the distribution. Qlikview has an inverse normal function, so once you work out the percentile, you can calculate what the Z score should be from that.
It also has a rank function, so I have used that to calculate the percentile. NB, it ranks highest to lowest, so you need to deduct the percentage from 1 in order to reverse it.
=NORMINV( 1(rank(quantity)  0.5) / count(distinct total [dimension]) , 0,1)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(Percentile) ~~~
0 and 1 standardise the normal
distribution to plain zscores
Set the Expected Z to your x axis by placing it first in the expressions, and the actual z to your Y axis.
The chart should then plot something similar to above.
I am not sure however, how to get a diagonal reference line. Perhaps someone else could help with that!
The best work arounds I have for that is to either base the chart as a line chart, with the "expected Z" as an aggregate dimension, or to float another chart with exactly the same dimensions on top with a diagnonal line throught it. Or just draw a line. But that is for another post.
Hope that helps
Erica

Re: Normal probability plot
Aldo Liaks Dec 11, 2011 2:38 PM (in response to Erica Whalley)Erica,
It is exacly what I was looking for.
Great explanation.
Thanks a lot.
First of all, thanks for your answer.
What about the diagonal line?
Another thing... I also need to plot a "Gauss Bell" (I don't know the exact denomination in english). Any chance you know how to solve it?
Thanks again,
Aldo.
Message was edited by: Aldo Liaks

Re: Normal probability plot
Stefan Wühl Dec 11, 2011 5:29 PM (in response to Aldo Liaks)Erica,
very nice explanation!
Aldo, you were asking for a diagonal line indicating a normal distribution with mean 0 and stdev 1, as Erica said, I think this could be done using a line chart and expected as an aggregated dimension. Maybe like attached?
Have a nice weekend,
Stefan

normal probability plot.qvw 149.5 K

Re: Normal probability plot
Aldo Liaks Dec 12, 2011 1:58 AM (in response to Stefan Wühl )Thanks Stefan!

Normal probability plot
Erica Whalley Dec 13, 2011 3:13 PM (in response to Stefan Wühl )I think the example is exactly what I was talking about, cheers.

Normal probability plot
Aldo Liaks Dec 17, 2011 2:54 PM (in response to Erica Whalley)Your example is exactly what I was looking for.
I believe can be pretty fun (and usefull) adding statistical analysis to several QV models.
Thanks.
Aldo.




Re: Normal probability plot
boonkiat tay Jan 4, 2015 10:36 PM (in response to Erica Whalley)Hi Erica,
I am newbie.
possible to make your solution work when the dataset got duplicated same value?
example as attach.
I found out, when there is duplicated same value, there will be missing data point plotted on the chart.
Thanks in advance.

normal probability plot(1).qvw 147.2 K

Re: Normal probability plot
Erica Whalley Mar 12, 2015 1:04 PM (in response to boonkiat tay)Hi Parrichtaty
This is beacause the duplicated value is being used as a dimension, so it is being "grouped". There are three options when this happens:
1) Leave it grouped! The values would yield the same result, and if you were to click on the point in the chart, those two items would be selected. You'd need to make sure, however that any ranking functions, or other aggregated calculations, sums etc take into account the fact that there are two values. You can set the Qlikview rank() function to return the same rank when two items have the same value.
2) Add a second dimension into the line chart that has a unique value for every x. This will separate them out
3) Use a scatterplot, not a line chart to produce the charts
Erica



Normal probability plot
Erica Whalley Dec 13, 2011 3:12 PM (in response to Aldo Liaks)Thanks stefan  I have to admit I enjoyed doing that explanation, I love seeing people use and understand statistics properly, better still incorportae them into reports that are accessed by many, ie via Qlikview!
Aldo: Gaussian functions are a type of equation that produce a bellshaped curve, the normal distribution being the most "famous" member of this family. A gauss bell simply describes the curve that is plotted on a graph. What exactly are you trying to do?
Are you trying to plot your own data and see if it is bellshaped? This is very simple. In this case use a bar chart to plot the count of your data, but with a calculated dimension using a class() function.
EG in my example of kidney disease rates above I set the dimension as =class(Rate,0.01) and the expression is simply a count of each group. The result is shown below, roughly bell shaped, typical of a lot of data (but never to be assumed!!!)
Is this what you are looking for?

Normal probability plot
lyonnaisdu26 Dec 21, 2011 8:04 AM (in response to Erica Whalley)Hi Erica.
Can you explain how you make this graph?
Can you give me your qvw ? I have lot of data and i want to see if the distribution is following a normal probability plot ?
Thanks a lot.
Regards.

Re: Normal probability plot
Ski Sim Dec 3, 2013 11:39 AM (in response to Erica Whalley)Wow, you guys are helpful. I am trying to build a Z table in order to generate a Probability plot. But then...when I do some research on this forum. I found the answer. Thank you guys.
