Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Everyone,
I'm new to Qlik sense and wanted to understand how to write set analysis formula for my situation. I have below data.
ID | Product | Spec | Value |
1 | A | SA1 | 10.5 |
2 | A | SA1 | 10.7 |
3 | A | SA2 | 55 |
4 | A | SA2 | 100 |
5 | A | SA2 | 50.5 |
6 | B | SB1 | 0.1 |
7 | B | SB1 | 0.2 |
8 | B | SB1 | 0.14 |
9 | B | SB1 | 0.22 |
10 | B | SB2 | 10 |
11 | B | SB2 | 11.7 |
12 | B | SB2 | 12 |
I need to gat the count of records which fall outside the range between Average+-Standard Deviation for each Product and Spec combination. The manual calculation is given below and I have highlighted the required output in bold.
Product | Spec | Average | Standard Dev | Avg - ( Std Dev) | Avg + (Std Dev) |
A | SA1 | 10.600 | 0.141 | 10.459 | 10.741 |
A | SA2 | 68.500 | 27.372 | 41.128 | 95.872 |
B | SB1 | 0.165 | 0.055 | 0.110 | 0.220 |
B | SB2 | 11.233 | 1.079 | 10.155 | 12.312 |
Product | Spec | Total Records | Deviated Records |
A | SA1 | 2 | 0 |
A | SA2 | 3 | 1 |
B | SB1 | 4 | 0 |
B | SB2 | 3 | 0 |
Please help me understand on how to write an expression for this calculation. Thanks for the help in advance.
You need the values for each Product and Spec combination. Since a set analysis is evaluated for the entire chart, it wont be of any use here.
Btw: You missed a few deviated records in your example. Both Specs for Product B should have 1 each.
If possible, I would do what @steeefan recommended and do the calculation in the script:
Data:
NoConcatenate Load * Inline [
ID Product Spec Value
1 A SA1 10.5
2 A SA1 10.7
3 A SA2 55
4 A SA2 100
5 A SA2 50.5
6 B SB1 0.1
7 B SB1 0.2
8 B SB1 0.14
9 B SB1 0.22
10 B SB2 10
11 B SB2 11.7
12 B SB2 12
] (delimiter is ' ');
Join(Data) Load Distinct
Product,
Spec,
Average + Deviation as UpperBound,
Average - Deviation as LowerBound;
Load
Product,
Spec,
Avg(Value) as Average,
StDev(Value) as Deviation
Resident Data
Group By
Product,
Spec;
Final:
NoConcatenate Load
ID,
Product,
Spec,
Value,
If(Value < LowerBound or Value > UpperBound, 1, 0) as Deviation_Flag
Resident Data;
Drop Table Data;
It's possible in the frontend too. But the expression can be quite performance intense:
Sum(
Aggr(
If(Value < Avg(total <Product, Spec> Value) - StDev(total <Product, Spec> Value)
or Value > Avg(total <Product, Spec> Value) + StDev(total <Product, Spec> Value),
1
),
ID, Product, Spec
)
)
My suggestion is to do that in the script rather than in set analysis. What's your reasoning?
Hi @steeefan , Doing it in load script will not help me achieve my goal. I have a date filter on my sheet. I want the calculation to be dynamic. The average and standard deviation calculation will change based on the filter. And I believe doing it in load script will make the calculation static. Please correct me if I'm wrong.
You need the values for each Product and Spec combination. Since a set analysis is evaluated for the entire chart, it wont be of any use here.
Btw: You missed a few deviated records in your example. Both Specs for Product B should have 1 each.
If possible, I would do what @steeefan recommended and do the calculation in the script:
Data:
NoConcatenate Load * Inline [
ID Product Spec Value
1 A SA1 10.5
2 A SA1 10.7
3 A SA2 55
4 A SA2 100
5 A SA2 50.5
6 B SB1 0.1
7 B SB1 0.2
8 B SB1 0.14
9 B SB1 0.22
10 B SB2 10
11 B SB2 11.7
12 B SB2 12
] (delimiter is ' ');
Join(Data) Load Distinct
Product,
Spec,
Average + Deviation as UpperBound,
Average - Deviation as LowerBound;
Load
Product,
Spec,
Avg(Value) as Average,
StDev(Value) as Deviation
Resident Data
Group By
Product,
Spec;
Final:
NoConcatenate Load
ID,
Product,
Spec,
Value,
If(Value < LowerBound or Value > UpperBound, 1, 0) as Deviation_Flag
Resident Data;
Drop Table Data;
It's possible in the frontend too. But the expression can be quite performance intense:
Sum(
Aggr(
If(Value < Avg(total <Product, Spec> Value) - StDev(total <Product, Spec> Value)
or Value > Avg(total <Product, Spec> Value) + StDev(total <Product, Spec> Value),
1
),
ID, Product, Spec
)
)
Thanks @LRuCelver for the quick help... I tried using the formula in the front end and it worked. This way I'll keep an eye on the dashboard performance. Thanks for the intimation.
Hi @LRuCelver ,
I had to make slight changes In the calculation. I created expressions based on the solution shared by you but was stuck at one point. Requesting your help here.
The change is in the calculation for Standard Deviation. As per the new logic, I need to calculate moving range as shown below.
ID | Product | Spec | Value | Moving range Abs(Current-Previous) Sorted by date |
1 | A | SA1 | 10.5 | |
2 | A | SA1 | 10.9 | 0.4 |
3 | A | SA2 | 55 | |
4 | A | SA2 | 100 | 45 |
5 | A | SA2 | 50.5 | 49.5 |
6 | B | SB1 | 0.5 | |
7 | B | SB1 | 0.6 | 0.1 |
8 | B | SB1 | 0.14 | 0.46 |
9 | B | SB1 | 0.22 | 0.08 |
10 | B | SB2 | 10 | |
11 | B | SB2 | 11.7 | 1.7 |
12 | B | SB2 | 30 | 18.3 |
The new standard deviation is the sum of the moving range divided by a constant as shown below. The expected count of deviated records is also highlighted.
New Std Dev | |||||||
Product | Spec | Average | Sum of Moving Range/(1.128* (Total Records-1)) | Average - New Std Dev | Average+New Std Dev | Total Records | Deviated Records |
A | SA1 | 10.6 | 0.355 | 10.2 | 11.0 | 2 | 0 |
A | SA2 | 68.5 | 70.922 | -2.4 | 139.4 | 3 | 0 |
B | SB1 | 0.165 | 1.702 | -1.5 | 1.9 | 4 | 0 |
B | SB2 | 11.233 | 18.794 | -7.6 | 30.0 | 3 | 0 |
I created the expressions for Moving Range and new standard deviation which is working fine. Only thing not working is the expression for deviated records.
Moving Range Sum :
sum(aggr( fabs(above(sum(Value)) - sum(Value)) ,Product,Spec,([Date], Numeric, Descending) ))
/
(Aggr(Count(ID),Product,Spec)-2+1)
Requesting your help to modify the formula you shared to updated the calculation for Standard Deviation.