Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
michael_andrews
Partner - Creator
Partner - Creator

Equivalent to Variance Function?

I'm wondering if there is some sort of equivalent to the variance sql function? I.e., a way for me to get 1.66 from this data as you would if you ran this sql

CREATE TABLE #foo(f_val int)

INSERT INTO #foo VALUES(1), (2), (3), (4)

SELECT var(f_val) FROM #foo

1 Solution

Accepted Solutions
swuehl
MVP
MVP

This leads away from the question how you can calculate variance in Qlik, but ok...

If I understand your request correctly, you are basically missing the part:

" get the variance for each ItemScore Grouped by their name. " And then average this list of values.

I think you already know how to do this in principle

Average a Sum of Child Records

=Avg ( Aggr( Sqr(Stdev(  i_item_score )), i_item_name))

AGGR...

When should the Aggr() function NOT be used?

Use Aggregation Functions!

Pitfalls of the Aggr function

View solution in original post

15 Replies
swuehl
MVP
MVP

Try

LOAD Sqr(Stdev(Val)) as Var;

LOAD * INLINE [

Val

1

2

3

4

];

michael_andrews
Partner - Creator
Partner - Creator
Author

So I'd have to add this to my load script? And if that's the case, how would I go about doing it on a column?

swuehl
MVP
MVP

Calculating a variance is like applying an aggregation function to a set of field values / records.

If you want to calculate a variance per some key, you would need to use a group by clause, just like you would if you want to Sum() or Average() your values.

In above sample, I haven't used another field to group by, so I don't need the group by clause (just like in your SQL).

michael_andrews
Partner - Creator
Partner - Creator
Author

I'll play with it some. I guess my bigger question was can I use that LOAD in the function of the measure, or do I have to add this aggregate thing as a new column in the data itself via the load script?

swuehl
MVP
MVP

You can use the same expression

=Sqr(Stdev( MEASUREFIELD))

as expression in a front end chart. Stdev() is a statistical aggregation function, hence you can use it same way you would use a Sum(), Avg() or Median().

Use one or more dimension(s) in your chart if you need to get your aggregate grouped by.

michael_andrews
Partner - Creator
Partner - Creator
Author

Ok, wondering if I can pick your brain on something a little more specific towards this. So the final calculation I'm trying to get is the Alpha Reliability of exam questions. Here's the sql that we use.

SELECT @itemVarianceAvg = AVG(is_item_variance)

  FROM

  (

  SELECT VAR(i_item_score) AS is_item_variance

  , i_item_name

  FROM #items

  WHERE i_exam_form = @currentForm

  GROUP BY i_item_name

  ) AS S

  WHERE 1=1;

So here, what we're basically saying is get the variance for each ItemScore Grouped by their name. To put it in context. People can take exams, each exam can have multiple items associated with it, but they will always have a unique name. So, we are getting all of the variances grouped by item, then averaging that. That is the @itemVarianceAvg

Now after that, we get this alpha reliability number by so

  SELECT @tempNumeric = ((CAST(@tempInt AS float) / (CAST(@tempInt AS float) - 1))

  * (1 - ((@itemVarianceAvg*@tempInt) / POWER(@stDevScore, 2))));

First thing is you can assume I already know the @tempInt number. Also, I know the proper function for the @stDevScore, but I'm trying to wrap this all into one line and I'm not getting things right. Here's what I have is this

(Count(ItemMeasure)/Count(ExamMeasure) - (Count(ItemMeasure)/Count(ExamMeasure) - 1))

*(1-((Sqr(Stdev(ItemScore))*Count(ItemMeasure)/Count(ExamMeasure)))/pow(stdev(ExamScore),2))

Also, I read up on variables a little, but I'm confused on how I could define say the @tempInt variable, since I have to resuse the line Count(ItemMeasure)/Count(ExamMeasure)

swuehl
MVP
MVP

This leads away from the question how you can calculate variance in Qlik, but ok...

If I understand your request correctly, you are basically missing the part:

" get the variance for each ItemScore Grouped by their name. " And then average this list of values.

I think you already know how to do this in principle

Average a Sum of Child Records

=Avg ( Aggr( Sqr(Stdev(  i_item_score )), i_item_name))

AGGR...

When should the Aggr() function NOT be used?

Use Aggregation Functions!

Pitfalls of the Aggr function

michael_andrews
Partner - Creator
Partner - Creator
Author

Well yea, my original variance question was me breaking this larger problem down into smaller portions. I'll look at this and get back to you. thanks.

michael_andrews
Partner - Creator
Partner - Creator
Author

Hey, thanks for your help, the Aggr chunk got me headed in the right direction. Is it possible to declare variables at all in that fx window? I got the calculation working, but I have to reuse a lot of the same calculations, and was wondering if I can create a local variable in that window, and just use it. This is it

(Count(ItemMeasure)/Count(ExamMeasure) / (Count(ItemMeasure)/Count(ExamMeasure) - 1)) *

(1-(((Avg(Aggr(Sqr(Stdev(ItemScore)), ItemName))) * (Count(ItemMeasure)/Count(ExamMeasure))) / (pow(stdev(ExamScore), 2))))

And so, for example, if there was a way for me to set Count(ItemMeasure)/Count(ExamMeasure) to a variable named say, @count (or however I'd declare it), it'd be a lot cleaner

(@count / @count - 1)) *

(1-(((Avg(Aggr(Sqr(Stdev(ItemScore)), ItemName))) * @cunt) / (pow(stdev(ExamScore), 2))))