Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us to spark ideas for how to put the latest capabilities into action. Register here!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

cross table with percentile

I tried to get a simple table with two fields: [price offered] and [response]. Price offered ranges from 1.0 to 100.0. Response is either accept or reject (1 or 0). In the table, I have one dimension using [price offered]. Because there are many different price values, I'd like to use percentiles to cut the population into 10 buckets. So the final table looks like:

# of response
price_offered# accept# reject
[1, pctl_10]100200
[pctl_10,pctl_20]3950
[pctl_90,100]30005000


My question is: how to get all the percentile values dynamically in the table (these values will change automatically with selection).

Any suggestion? Thanks!

5 Replies
johnw
Champion III
Champion III

Use an island dimension like price_percentile, varying from 0.1 to 1.0 and displayed as a percentage. Or maybe just a value loop calculated dimension. Then I suspect this or something close would work as your expression.

sum(if(price_offered >= fractile(total price_offered,price_percentile - 0.1)
and price_offered <= fractile(total price_offered,price_percentile), response))

If you can't get something along those lines to work, let me know, and I'll try to put together an example or come up with another solution if it really won't work at all.

Warning: if your data set is large, the performance of a sum(if()) can be poor. The only way I'm currently thinking to address that is fairly complicated.

Edit: Not sure you're up on common forum terminology. By "island dimension" I mean that you'd have a table with no connection to any other tables, containing only this dimension and its values. It's an island because it's surrounded by nothing, by no other tables. Something like this:

Percentiles:
LOAD recno()/10 as price_percentile
AUTOGENERATE 10;

Not applicable
Author

I did try something similar with fractile and even set operator. But the problem is, my data is huge (>2G). I guess that's reason why QV crashed. Any way to avoid the memory crash?

I'm not sure about "island dimension". Is it dynamically calculated everytime I make a selection?

Thanks!

johnw
Champion III
Champion III

The fractiles are dynamically recalculated every time you make a selection. The dimension itself is just the values 10% to 100%, so there's no recalculation in that regard. But with performance being an issue, just ignore my previous suggestion. It won't perform.

Set analysis fixes that problem as best we can, I think. You'd have ten expressions instead of one, each referencing their specific fractile range. But most fractile values get computed twice (and I worry possibly more than twice if QlikView isn't being smart about things).

I think you can force QlikView to only calculate the fractile values once (though recalculating when you make selections) by making them all variables. Just generate the variables in script, I'd say, since you have even ranges. Something like this:

FOR I = 10 TO 90 STEP 10
LET vP$(I) = '=fractile(price_offered,' & $(I)/100 & ')'
NEXT

That should avoid about half of the fractile calculations, and potentially a lot more. So then your expressions are like this:

sum({<price_offered={ "<=$(vP10)"}>} response)
sum({<price_offered={">$(vP10) <=$(vP20)"}>} response)
sum({<price_offered={">$(vP20) <=$(vP30)"}>} response)
...
sum({<price_offered={">$(vP90)" }>} response)

I'm sure I've made some syntax or whatever errors, but that would be the idea. So you're forcing the fractile values to be calculated only once, then you're using set analysis to split up your data set very efficiently, and finally you're summing response in each subset of the data. If that doesn't work, I'm not thinking of anything faster.

Not applicable
Author

John,

It works fine! I appreciate your help!

I would suggest adding this as a feature to QV. In a chart, if one dimension is numeric with tons of distinct values, QV should group them into equal bins by default. End users can select different cut points as an option. That would be great.

Thanks!

johnw
Champion III
Champion III


largezoo wrote:I would suggest adding this as a feature to QV. In a chart, if one dimension is numeric with tons of distinct values, QV should group them into equal bins by default. End users can select different cut points as an option. That would be great.


Well, we wouldn't want to group them into bins by default. Even if most developers wanted that, we can't make it a default for backwards-compatibility reasons, and I'm not sure most developers would want that.

But I agree it would make a nice option. Do you have access to the customer portal? You could suggest that as a new feature that you'd like to see in a future version of the product.