pgrenier
Partner - Creator III

Looking for integrated statistical functions

Good morning,

I am currently trying to get a firm grasp of QlikView's capabilities for formulating statistical expressions, and have found a few interesting functions under the Statistical Aggregation Functions in Charts and Statistical Distribution Functions sections in the inline help. So far, I am fairly confident that the following needs are covered (my question is at the end):

Kurtosis

kurtosis([{set_expression}][ distinct ] [ total [<fld { , fld } >] ] expression)

Returns the aggregated kurtosis of expression or field iterated over the chart dimension(s).

This function has the same limitations for nested aggregation as the avg([{set_expression}] [ distinct ] [ total [<fld { , fld } >]] expression) function. The kurtosis function supports Set Analysis and the total qualifier in the same way as the avg([{set_expression}] [ distinct ] [ total [<fld { , fld } >]] expression) function.

Examples:

kurtosis(Sales)

kurtosis(X*Y/3)

kurtosis(distinct Price)

kurtosis(total Sales)

kurtosis({1} total Sales)

Median

median ([{set_expression}] [ distinct ] [ total [<fld {,fld}>] ] expression )

Returns the aggregated median of expression iterated over the chart dimension(s).

This function has the same limitations for nested aggregation as the avg([{set_expression}] [ distinct ] [ total [<fld { , fld } >]] expression) function. The median function supports Set Analysis and the total qualifier in the same way as the avg([{set_expression}] [ distinct ] [ total [<fld { , fld } >]] expression) function.

Examples:

median( X )    

median( X*Y/3 )    

median( total X )    

median( total <Group> Price )

Standard Deviation

stdev([{set_expression}][ distinct ] [ total [<fld { , fld } >] ] expression)

Returns the aggregated standard deviation of expression or field iterated over the chart dimension(s).

This function has the same limitations for nested aggregation as the avg([{set_expression}] [ distinct ] [ total [<fld { , fld } >]] expression) function. The stdev function supports Set Analysis and the total qualifier in the same way as the avg([{set_expression}] [ distinct ] [ total [<fld { , fld } >]] expression) function.

Examples:

stdev(Sales)

stdev(X*Y/3)

stdev(distinct Price)

stdev(total Sales)

stdev({1} total Sales)

Mean

avg([{set_expression}] [ distinct ] [ total [<fld { , fld } >]] expression)

Returns the aggregated average of expression or field iterated over the chart dimension(s). [...]

If the word distinct occurs before the function arguments, duplicates resulting from the evaluation of the function arguments will be disregarded.

If the word total occurs before the function arguments the calculation will be made over all possible values given the current selections but disregarding the chart dimension variables.

The total qualifier may be followed by a list of one or more field names within angle brackets. These field names should be a subset of the chart dimension variables. In this case the calculation will be made disregarding all chart dimension variables except those listed, i.e. one value will be returned for each combination of field values in the listed dimension fields. Also fields which are not currently a dimension in a chart may be included in the list. This may be useful in the case of group dimensions, where the dimension fields are not fixed. Listing all of the variables in the group causes the function to work when the cycle or drill-down level changes.

Examples:

avg(Sales)

avg(X*Y/3)

avg(distinct Price)

avg(total Sales)

avg({1} total Sales)
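To make the distinct and total qualifiers described above more concrete, here is a rough Python sketch of the arithmetic they imply. It is not QlikView code: the rows, the Region dimension, and the helper names (avg_by, avg_distinct_by, avg_total) are all hypothetical, and it only mimics my reading of the help text.

```python
from statistics import mean

# Hypothetical rows of (Region, Price); Region plays the chart dimension.
rows = [("North", 10), ("North", 10), ("North", 40),
        ("South", 20), ("South", 30)]

def avg_by(rows):
    """Like avg(Price): one average per dimension value."""
    groups = {}
    for region, price in rows:
        groups.setdefault(region, []).append(price)
    return {region: mean(prices) for region, prices in groups.items()}

def avg_distinct_by(rows):
    """Like avg(distinct Price): duplicate prices within a group count once."""
    groups = {}
    for region, price in rows:
        groups.setdefault(region, set()).add(price)
    return {region: mean(prices) for region, prices in groups.items()}

def avg_total(rows):
    """Like avg(total Price): the chart dimension is disregarded."""
    return mean(price for _, price in rows)

per_region = avg_by(rows)                    # North: 20, South: 25
per_region_distinct = avg_distinct_by(rows)  # North: 25, South: 25
total = avg_total(rows)                      # 22
```

In a chart, the total value would be repeated on every row, which is what makes it usable as a denominator for share-of-total expressions.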

Distribution

normdist (value, mean, standard_dev)

Returns the cumulative normal distribution for the specified mean and standard deviation. Value is the value at which you want to evaluate the distribution. Mean is a value stating the arithmetic mean for the distribution. Standard_dev is a positive value stating the standard deviation of the distribution. All arguments must be numeric, else null will be returned. If mean = 0 and standard_dev = 1, the function returns the standard normal distribution. This function is related to the norminv (prob, mean, standard_dev) function in the following way:

If prob = normdist(value, m, sd), then norminv(prob, m, sd) = value.

Example:

normdist( 0.5, 0, 1 ) returns 0.6914625
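The documented value and the normdist/norminv relationship can be cross-checked outside QlikView. The Python sketch below is only an independent reimplementation using the standard library (math.erf and statistics.NormalDist); the function names mirror the QlikView ones for readability, but this is not how QlikView computes them internally as far as I know.

```python
from math import erf, sqrt
from statistics import NormalDist

def normdist(value, mean, standard_dev):
    """Cumulative normal distribution evaluated at value."""
    z = (value - mean) / (standard_dev * sqrt(2))
    return 0.5 * (1 + erf(z))

def norminv(prob, mean, standard_dev):
    """Inverse of normdist: the value whose cumulative probability is prob."""
    return NormalDist(mean, standard_dev).inv_cdf(prob)

p = normdist(0.5, 0, 1)   # about 0.6914625, matching the help example
x = norminv(p, 0, 1)      # recovers about 0.5
```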

At this point, though, I haven't found any information or functions related to symmetry analysis, or to two of the central tendency and dispersion measures: mode and variance.

Does anyone have any hints or pointers on this subject matter?

Thanks in advance for your time, regards,

Philippe

1 Solution

Accepted Solutions
Clever_Anjos
Employee

Maybe

skew([ distinct] expression)

Returns the skewness of expression over a number of records as defined by a group by clause. If the word distinct occurs before the expression, all duplicates will be disregarded.

Example:

Load Month, skew(Sales) as SalesSkew from abc.csv group by Month;

and

kurtosis([distinct ] expression )

Returns the kurtosis of expression over a number of records as defined by a group by clause. If the word distinct occurs before the expression, all duplicates will be disregarded.

Example:

Load Month, kurtosis(Sales) as SalesKurtosis from abc.csv group by Month;

can help you


7 Replies
Clever_Anjos
Employee

QlikView provides stdev(), so

variance = pow(stdev(expression), 2)

and it provides a mode() function too
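As a quick sanity check of that identity outside QlikView, the Python sketch below squares the standard deviation and compares it with a directly computed variance. The sales values are made up, and I am assuming the sample (n - 1) form of the standard deviation; the thread does not say which form QlikView's stdev uses, and the identity holds for either as long as both sides use the same one.

```python
from statistics import stdev, variance

# Hypothetical sample; any numeric field values would do.
sales = [12.0, 15.0, 9.0, 20.0, 14.0]

# Same idea as pow(stdev(Sales), 2) in a QlikView expression.
variance_from_stdev = stdev(sales) ** 2

# Direct sample variance, for comparison.
direct_variance = variance(sales)
```

Both values come out at 16.5 for this sample, up to floating-point rounding.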

pgrenier
Partner - Creator III
Author

Thank you Clever Anjos

Now, all that's left to figure out, is how to evaluate the Symmetry!

Cheers,

Philippe

Clever_Anjos
Employee

I'm not familiar with Symmetry analysis.


pgrenier
Partner - Creator III
Author

Symmetry analysis is a type of statistical distance measure. Here's an excerpt from Wikipedia on the subject:

In statistics, probability theory, and information theory, a statistical distance quantifies the distance between two statistical objects, which can be two random variables, or two probability distributions or samples, or the distance can be between an individual sample point and a population or a wider sample of points.

A distance between populations can be interpreted as measuring the distance between two probability distributions and hence they are essentially measures of distances between probability measures. Where statistical distance measures relate to the differences between random variables, these may have statistical dependence, and hence these distances are not directly related to measures of distances between probability measures. Again, a measure of distance between random variables may relate to the extent of dependence between them, rather than to their individual values.

Statistical distance measures are mostly not metrics and they need not be symmetric. Some types of distance measures are referred to as (statistical) divergences.

Metrics

A metric on a set X is a function (called the distance function or simply distance)

d : X × X → R (where R is the set of real numbers). For all x, y, z in X, this function is required to satisfy the following conditions:

  1. d(x, y) ≥ 0     (non-negativity)
  2. d(x, y) = 0   if and only if   x = y     (identity of indiscernibles. Note that condition 1 and 2 together produce positive definiteness)
  3. d(x, y) = d(y, x)     (symmetry)
  4. d(x, z) ≤ d(x, y) + d(y, z)     (subadditivity / triangle inequality).

Source: Statistical distance - Wikipedia, the free encyclopedia

pgrenier
Partner - Creator III
Author

Apparently, from what I've read, symmetry can be inferred from the skewness: a symmetric distribution has skewness = 0.

Many thanks Clever Anjos!

Clever_Anjos
Employee

So checking if skewness = 0 will show you the symmetry
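For anyone who wants to see the idea numerically, here is a small Python sketch of a skewness check. It uses the plain moment-based formula (third central moment divided by the second central moment to the power 1.5); I don't know which exact estimator QlikView's skew() uses, so treat the formula, the sample data, and the tolerance in is_roughly_symmetric as assumptions. Also keep in mind that zero skewness is expected for symmetric data but does not by itself prove symmetry, and real data will rarely give exactly 0.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (population form, an assumption)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def is_roughly_symmetric(values, tol=0.1):
    """Hypothetical helper: treat |skewness| below tol as 'symmetric enough'."""
    return abs(skewness(values)) < tol

symmetric = [1, 2, 3, 4, 5]        # mirror image around 3
right_skewed = [1, 1, 1, 2, 10]    # long tail to the right

skew_symmetric = skewness(symmetric)   # 0.0
skew_right = skewness(right_skewed)    # positive
```

In QlikView itself the equivalent check would be an expression comparing the absolute value of skew(Sales) against a small threshold of your choosing.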