17 Replies Latest reply: Dec 21, 2016 7:13 PM by Charles Demarest

# Cumulative binomial distribution in equation editor

Hi, I am new to Qlik Sense and was not able to figure out how to calculate cumulative binomial distribution in the equation editor.  I would like to have this value in a table.

Qlik sense does not have discrete binomial distribution function either, but that can be calculated using existing functions as per below:

combin(n,x)*pow(0.05,x)*pow(0.95,n-x)

n and x are integers representing the number of observations and the number of statistical test exceedences at a 95% confidence interval.  The above formula works for discrete outcomes, for example, the probability of having exactly 5 out of 100 results be exceedences (n=100 and x=5 plugged into above), however, what I want is the cumulative probability of 5 or fewer exceedences, which would be the above with (n=100, x=0) + (n=100, x=1) + ... + (n=100, x=5).  It would seem like this could be achieved with a loop, but this does not seem to be available in the editor or an ability to create it in a custom function, something like:

cbd = 0

for i= 0 to x

cbd = cbd + combin(n,i)*pow(0.05,i)*pow(0.95,n-i)

next i

Any help with this would be appreciated

• ###### Re: Cumulative binomial distribution in equation editor

Would you be able to provide few rows of data with the numerical output you expect to see from it?

• ###### Re: Cumulative binomial distribution in equation editor

Hi,  thanks for the responses.  Note that I did find a way to get what I want into the report, but it involved some very convoluted SQL involving functions from this post (Cumulative distributive functions for normal and binomial).  I did not handle this on the Qlik side, although I believe it should be possible, so I still would be appreciative if someone could show a way to do it in Qlik (given the data in the table below).  Here is a contrived example, two years of monthly return data, 25 total observations.  There is a measure called VaR which asserts that if the risk model is accurate, the absolute value of the monthly return should exceed VaR in only about 5% of cases. In this example, there are two observations where the realized return exceed the VaR amount.  The cumulative binomial distribution is a hypothesis test saying that the probability of having 2 or fewer exceedences out of 25 observations (given an accurate model) is 87%.  That is the number I am trying to get.  In Excel, it is one formula as per below, in SQL server it is involves coding as per above link.  In Qlik, it is possible to get a discrete probability or the probability of having exactly 2 exceedences:

combin(n,x)*pow(0.05,x)*pow(0.95,n-x)

where n=25 and x=2.  If it would possible to take the aggregation of this formula for x=0, x=1, x=2, then I would get my answer:

Hope that more clear and thanks in advance for any help.

• ###### Re: Cumulative binomial distribution in equation editor

Would you be able to attach the Excel file here?

• ###### Re: Cumulative binomial distribution in equation editor

Hi Sunny,  sure, see attached.  I use random numbers for the returns in the example, so column D will change if you recalculate something or hit F9. Note that I am hard coding the 5% threshold, but the number of observations and exceedences will change for different funds / history used, so a solution would have to be flexible there.

• ###### Re: Cumulative binomial distribution in equation editor

I sort off understand the whole thing except these two numbers. What is the logic behind these two numbers?

• ###### Re: Cumulative binomial distribution in equation editor

Hi Sunny, Sorry that made the example a little more convoluted, those numbers are just parameterizing the random numbers (column D) so that they would be scaled similarly to the returns of one of the funds I am analyzing.  You can ignore those.

• ###### Re: Cumulative binomial distribution in equation editor

=Sum( combin(25,ValueLoop(0,2))*pow(0.05,ValueLoop(0,2))*pow(0.95,25-ValueLoop(0,2)) )

?

• ###### Re: Cumulative binomial distribution in equation editor

Super!! that did seem to have worked

How exactly is Combin (Combination) function works here?

• ###### Re: Cumulative binomial distribution in equation editor

Hi Stefan,

Thanks for the suggestion, it looked promising and works fine until I try to replace one of the parameters in the ValueLoop function with a variable (then I get "Error in expression").  The idea here is that the number of exceedences and the number of observations can change from query to query or fund to fund.  I consulted the docs on ValueLoop and did not see anything about whether or not it can take a variable, but when I try it does not seem to be allowed.

=Sum( combin(num_obs,ValueLoop(0,2))*pow(0.05,ValueLoop(0,2))*pow(0.95,num_obs-ValueLoop(0,2)) )

(OK)

=Sum( combin(num_obs,ValueLoop(0,var_exc))*pow(0.05,ValueLoop(0,2))*pow(0.95,num_obs-ValueLoop(0,2)) )

(Error in expression)

Let me know if you have any other ideas or if I'm doing something wrong.

• ###### Re: Cumulative binomial distribution in equation editor

This seems to be working with dollar sign expansion of the variable:

=Sum( combin(25,ValueLoop(0,\$(=var_exc)))*pow(0.05,ValueLoop(0,\$(=var_exc)))*pow(0.95,25-ValueLoop(0,\$(=var_exc))))

• ###### Re: Cumulative binomial distribution in equation editor

Thanks for the reply.  I was unable to get this approach to work and will come back to it later.  My version of Qlik Sense hassles me when I try to open a file someone else created, so I did not open your file.  I'll read up on the dollar sign expansion and equal sign notation in your formula, in the meantime I will use the SQL-Server solution.

• ###### Re: Cumulative binomial distribution in equation editor

The parameters to valueloop do not allow variables or expressions, anything that need to be evaluated, AFAIR.

But you can use dollar sign expansions like a preprocessor text replacement.

The correct syntax may be dependent on how you are defining your variables (talking of QV variables, not statistics).

The Magic of Variables

The Magic of Dollar Expansions

If you intend to use this as part of a table chart and you need to have the valueloop()  be based on a dimensional field value, then I don't think this will work.

• ###### Re: Cumulative binomial distribution in equation editor

Thanks for your help in any case.  This is what I am doing.  I am only wanting the value to show up in a table.  Current approach works as per below, just thought might be more elegant way to do it in QS.

• ###### Re: Cumulative binomial distribution in equation editor

Seems like you are calculating your numbers currently in the script using the SQL code, and you want to replace it with QS code.

Try

Input:

pDate,

var_up,

return,

num_obs,

var_exc

FROM

[.\BinomCDFexample.xlsx]

(ooxml, embedded labels, header is 3 lines, table is Sheet1)

WHERE LEN(TRIM(portfolio));

Stats:

TotalExceedence,

NumObservations,

Sum(Binom) as CumBinom

Group by portfolio, TotalExceedence, NumObservations;

TotalExceedence,

NumObservations,

combin(NumObservations,iterno()-1)*pow(0.05,iterno()-1)*pow(0.95,NumObservations-(iterno()-1)) as Binom

While iterno()-1 <= TotalExceedence;

sum(var_exc) as TotalExceedence,

Count(portfolio) as NumObservations

RESIDENT Input

Group by portfolio;

• ###### Re: Cumulative binomial distribution in equation editor

Thanks Stefan, you helped me figure out how to do it without using the SQL function.  It seems though, that Qlik does not let you do "resident" style manipulations on data loaded from a select statement (Loading data from a previously loaded table ‒ Qlik Sense)

Extract from link: "Example 1: Transforming data loaded by a SELECT statement

If you load data from a database using a SELECT statement, you cannot use Qlik Sense functions to interpret data in the SELECT statement. The solution is to add a LOAD statement, where you perform data transformation, above the SELECT statement."

However, it seems I can do it with two SQL statements as per below. One caveat is that when using SummarySet data in the table, I use AVG() for everything except Binom where I use SUM().

LIB CONNECT TO 'SQL2012PROD';

//full set with data for chart
FullSet:
pDate,
var_up,
var_down,
Capital,
pct_return

sql select A.portfolio,
pDate,
Val_At_Risk_Pct * 2 as var_up,
val_at_risk_pct * -2 as var_down,
pct_return,
Capital,

from
{SQL omitted};

//summary set for data with table

SummarySet:
Observations,
PositiveExc,
NegativeExc,
TwoTailExc,
combin(Observations,iterno()-1)*pow(0.05,iterno()-1)*pow(0.95,Observations-(iterno()-1)) as Binom

While iterno()-1 <= TwoTailExc;

sql select portfolio,
sum(num_obs) as Observations,
sum(var_pos_exc) as PositiveExc,
sum(var_neg_exc) as NegativeExc,
sum(var_exc)  as TwoTailExc
from  {SQL omitted}

group by portfolio;