Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Fractile function

Hi everyone,

I have a list of customers with profits associated. I want to use the fractile function. I was thinking that the fractile function take the 10 first % of my clients and calculated the profit. but in fact, it takes the 10% of profits.

How can I have this :

0-10% of best clients profit | number of customers (and each line contains the same number of customers)

10-20% |  1 759

20-30% |  1 759

30-40% |  1 759

...

Thanks!

1 Solution

Accepted Solutions
swuehl
MVP
MVP

I think it's quite clear. I see what you are talking about.

Please note that the fractile function itself is not dividing your data set into equal parts, it's just returning the threshold value that divides your data set into equal parts.

The issue arises if you have clustered data (like in your case, your data is clustered around the integer values or values with a single decimal, create a histogram or a list box with frequency count to see what I mean).

If a fractile calculations returns one of the clustered values, your chart will count all of these records even if only part of should be counted.

To illustrate:

LOAD *, recno() as Value2, 1 as Value3 INLINE [

Value

2

2

2

2

2

8

9

3

4

1

2

];

This creates three different distibutions, try calculating your deciles for all three and check the outcome.

Hope this helps,

Stefan

edit: using rank() in the calculated dimension might be get you a bit closer to what you want to achieve

=aggr(if(rank(sum(Profit),0,4)<0.1*count(total Customer), '90-100%',

if(rank(sum(Profit),4)<0.2*count(total Customer), '80-90%',

if(rank(sum(Profit),4)<0.3*count(total Customer), '70-80%',

if(rank(sum(Profit),4)<0.4*count(total Customer), '60-70%',

if(rank(sum(Profit),4)<0.5*count(total Customer), '50-60%',

if(rank(sum(Profit),4)<0.6*count(total Customer), '40-50%',

if(rank(sum(Profit),4)<0.7*count(total Customer), '30-40%',

if(rank(sum(Profit),4)<0.8*count(total Customer), '20-30%',

if(rank(sum(Profit),4)<0.9*count(total Customer), '10-20%'

, '0-10%'))))))))),Customer)

View solution in original post

7 Replies
swuehl
MVP
MVP

Sorry, I haven't got what you are trying to accomplish.

If you are in doubt, what a fractile calculation does, please check:

http://en.wikipedia.org/wiki/Fractile

or start with the well known special case of a median:

http://en.wikipedia.org/wiki/Median

If you want to get all customers per line:

=count(TOTAL customer)

or

=count(TOTAL customer) / 10

to get your required 1759 (i.e. count of TOTAL divided by number of your classes)

Not applicable
Author

thanks for your answer

I was using the count(customer) to check the result of fractile function.

I was thinking that if I use fractile(), each lines will have the same number of customers. But it's not.

"In descriptive statistics, any of the nine values that divide the sorted data into ten equal parts"

And it's false if i used the fractile function, there is not ten equal parts

Not applicable
Author

Any help please ?

Is it clear?

swuehl
MVP
MVP

I think it's quite clear. I see what you are talking about.

Please note that the fractile function itself is not dividing your data set into equal parts, it's just returning the threshold value that divides your data set into equal parts.

The issue arises if you have clustered data (like in your case, your data is clustered around the integer values or values with a single decimal, create a histogram or a list box with frequency count to see what I mean).

If a fractile calculations returns one of the clustered values, your chart will count all of these records even if only part of should be counted.

To illustrate:

LOAD *, recno() as Value2, 1 as Value3 INLINE [

Value

2

2

2

2

2

8

9

3

4

1

2

];

This creates three different distibutions, try calculating your deciles for all three and check the outcome.

Hope this helps,

Stefan

edit: using rank() in the calculated dimension might be get you a bit closer to what you want to achieve

=aggr(if(rank(sum(Profit),0,4)<0.1*count(total Customer), '90-100%',

if(rank(sum(Profit),4)<0.2*count(total Customer), '80-90%',

if(rank(sum(Profit),4)<0.3*count(total Customer), '70-80%',

if(rank(sum(Profit),4)<0.4*count(total Customer), '60-70%',

if(rank(sum(Profit),4)<0.5*count(total Customer), '50-60%',

if(rank(sum(Profit),4)<0.6*count(total Customer), '40-50%',

if(rank(sum(Profit),4)<0.7*count(total Customer), '30-40%',

if(rank(sum(Profit),4)<0.8*count(total Customer), '20-30%',

if(rank(sum(Profit),4)<0.9*count(total Customer), '10-20%'

, '0-10%'))))))))),Customer)

Not applicable
Author

thanks a lot for your help!

Not applicable
Author

There is an error in the script :

It works with rank(sum(profit), 4, 1)

=aggr(if(rank(sum(Profit),4, 1)<0.1*count(total Customer), '90-100%',

if(rank(sum(Profit),4, 1)<0.2*count(total Customer), '80-90%',

if(rank(sum(Profit),4, 1)<0.3*count(total Customer), '70-80%',

if(rank(sum(Profit),4, 1)<0.4*count(total Customer), '60-70%',

if(rank(sum(Profit),4, 1)<0.5*count(total Customer), '50-60%',

if(rank(sum(Profit),4, 1)<0.6*count(total Customer), '40-50%',

if(rank(sum(Profit),4, 1)<0.7*count(total Customer), '30-40%',

if(rank(sum(Profit),4, 1)<0.8*count(total Customer), '20-30%',

if(rank(sum(Profit),4, 1)<0.9*count(total Customer), '10-20%'

, '0-10%'))))))))),Customer)

thanks a lot !

Rajesh31
Contributor III
Contributor III

why are you using 4,1 , what does it signifies