1 Reply Latest reply: Sep 8, 2011 10:31 PM by Daniel Rozental RSS

    Generating random numbers with Normal (Gaussian) distribution

      Hey Everyone,

      I’m not sure about you, but sometimes I need to generate random data, and occasionally I need random data that actually looks like something. In my foray into attempting to create random data with a normal distribution for some sample dashboards I was building I discovered the Box-Muller transformation.

       

      Box-Muller uses some clever trigonometry, and properties of the unit circle to transform two random numbers between 0 and 1 (such as those generated by the RAND() function) into one (or two) random numbers that conform to a normal distribution with a mean value of 0, and a standard deviation of 1.

       

      Details on the math of the transformation can be found here: http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform

       

      There are two main forms of the Box-Muller transformation; I’ve used the basic form with the trig functions to good effect.

      The function, in QlikView syntax looks like this:

       

       

      sqrt(-2*log(rand())) * cos(2* pi() * rand())

       

       

      For my purposes, I just discarded the other half of the transformation using sin(), since they both accomplish the same thing.

       

      Now with the handy AUTOGENERATE function in QlikView, I’ve generated a bunch of normal data, but how often are you going to need numbers with a normal distribution with a mean of 0 and a standard deviation of 1? Probably not very often.

       

      The good news is, the box-muller transformation function can be shifted, just like function transformations you learned in high school math. All you need to do is multiply the number generated by the desired standard deviation, and add the desired mean.

      So:

       

       

      set fxGaussianRandom = sqrt(-2*log(rand())) * cos(2* pi() * rand());

      LOAD

      ($(fxGaussianRandom) * 15) + 35 AS AGE

      AUTOGENERATE(1000);

       

       

      Creates 1000 records with a normal distribution whose average is 35, and standard deviation is 15.

      Super handy, and simpler than I thought.

      Happy Randoming!