Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Average, Standard deviation, Confidence interval, ...

Hi guys !

I'm new to Qlik Sense, and i'm trying to develop an app.

One of my source data file has the same format as the one you will find attached to this message : 4 raws, the first with a category of product, the second with the type of product (can be the same type but not the same category), the third for the sales (whatever the unit is), and the fourth, the date of sale.

What I would like to do seems easy, but I couldn't find how to make it :

I need to calculate and display the average of sales per day, but making first the sub-total for each category.

I explain : for instance, the total of sales for category A and day 01/01/2010 is 50 (15+25+10)

for category B same day is 60 (30+20+10)

for category C same day 70 (20+20+30)

What I need to do is to calculate the average of sales for categories A B and C per day. For this example and the date 01/01/2010, it would be (50+60+70)/3 = 60. And so on for every days... and drawing the curve.

And the thing I was doing till now was calculating the average for day 01/01/2010 of each individual value (like (15+25+10+30+20+10+20+20+30) / 9 ) ... So not the same result at all ..

How should I do to perform my calculation ? You would help me so much guys !

Once this will be done, I'll have another problem of confidence interval :

I'll need to calculate the standard deviation between the total for categories A B and C per day (like for day 01/01/2010 : standard deviation between 50, 60 and 70, for the date 02/01/2010, standard deviation between [total category A at that day = 40] and [total category B at that date = 30] and [total category C at that date = 50] etc..), in order to display a 90% confidence interval around my average curve (of course my original data are way more complete than this example file, but the way to think is exactly the same).

So if you already have a clue for that as well ....

Thanks sooo much, waiting for your help !

Fab.

10 Replies
sunny_talwar

Are these the numbers you expect to see?

Capture.PNG

Not applicable
Author

Exactly what I needed ! Thanks so much.

Now I can draw my confidence interval, using the statistics normal distribution law ! like placing my upper and lower lines with the formula :

average + Z (according to Z table) * Stdev / sqrt(total number) (for the max line)

or average - Z (according to Z table) * Stdev / sqrt(total number) (for the min line)

No better solution ?

sunny_talwar

I think that is how you will have to do this... can't think of any better way to do this

Not applicable
Author

I finally could do this, thanks again for your good help !

An easier way for a 95% interval was just to display

Avg(Aggr(Sum(Sales), Category, Date))  +/-  2 * Stdev(Aggr(Sum(Sales), Category, Date))

I just remembered this 1, 2 or 3 sigmas rule ! ahah

So the same if you want a 68% interval, just substitute factor 2 by factor 1, and for a 99% interval, substitute factor 2 by factor 3, and it's done !

Works perfectly for me !

Not applicable
Author

Hi there !

Another question related to this previous example, that we solved last week :

The solution allowed us to draw another graph with the average and a confidence interval, that's great.

But what if now, I would like to display this average and interval in the same graph as the one representing the sales on vertical axis, and for dimensions date and category ?

The problem is that, I have two dimensions : Dates and Category, and the formulas Sunny adviced me last week are to be set as measures. However, you can't draw a graph with several dimensions AND several measures. You can just with several dimensions and one measure, or several measures and one dimension.

What do you advice me guys ?

What i want to see is the 3 curves of evolution of the sales along the time (one per category), and on the same graph, the average, and the confidence interval as we found last week. In fact, the final aim would be to be able to see when one curve goes out of this interval and is in a "risky" situation.

Thanks in advance for your precious help

sunny_talwar

Would you be able to share a sample outside of Qlik Sense to show what you are trying to get? May be create a dummy Excel sample where you can show the graph you are looking to get based on the sample data you provided or new sample data?

Not applicable
Author

Hi Sunny !

Sorry for the delay in responding !

So, i put attached a new excel file simplified : 3 raws, one for the name, one for the date, one for the money they have.

What i want to see is on the same visualization 1) the evolution of the money for each character along the time (from day -5 till day 0 = today) and 2) the confidence interval of 95% as we did before.

The 1), I did it easily : creating two dimensions Date and Name, and one measure : Sum(Money). As you can see on "Picture_1"

The 2), thanks to what you explained previously, I do it easily on another view, creating one dimension : Date, and two measures :

Avg(Aggr(Sum(Money), Date))  +  2 * Stdev(Aggr(Sum(Money), Date))

and

Avg(Aggr(Sum(Money), Date))  -  2 * Stdev(Aggr(Sum(Money), Date))

My question is now, how could I display this confidence interval on my visualization 1), to be fast able to see which is out of the confidence interval at a D date ? (as drawn on Picture_2, with in red the lower limit and green the upper limit of the 95% interval). In fact, i have already several dimensions (2 in this example), and I need several measures (3 in this example), which is impossible ...

Thanks so much for your great help and availability !

Picture 1

Picture_1.JPG

Picture 2

Picture_2.JPG

Not applicable
Author

Thanks for your help Sunny ! You can find all that in my message below

Not applicable
Author

Up ! Help pleaaaaase.

Thx so much in advance to those who can help me.