Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
thooper
Contributor
Contributor

How to evenly distribute values into quartiles, help with adapting existing script from using FRACTILE

Hi all,

I've been trying to adapt an existing Qlik App to use a more even representation of equal quartile groups, instead of Fractile which is troublesome IF you happen to have a large portion with the same values.

I took the advice offered here - How to distribute values into quartiles evently? - Qlik Community - 1344684

But have still struggled to get it to work properly.

In our app we are using Fractile like this (4 variables);

=FraCTile({<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY, 0.25)

And then using this in set analysis in a few diff ways like this;

Num(Avg({<HOURLY_TOTAL_PAY ={">0<=$(=vQuatile1)"}, EXCLUSION_REASON = {'Include'}>}HOURLY_TOTAL_PAY), '###,###.##')

I can get the rank logic to work as a variable/column in a table, but can't get it to work in the set analysis (of which there are many), I'm assuming because of the order calculations are completed?

This works perfectly as the column from a variable.

if(ceil(rank(total only({<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY))
/ (count(total {<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY) / 4)) = 4, '1',
if(ceil(rank(total only({<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY))
/ (count(total {<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY) / 4)) = 3, '2',
if(ceil(rank(total only({<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY))
/ (count(total {<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY) / 4)) = 2, '3',
if(ceil(rank(total only({<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY))
/ (count(total {<EXCLUSION_REASON = {'Include'}, HOURLY_TOTAL_PAY = {">0"}>}HOURLY_TOTAL_PAY) / 4)) = 1, '4',
'5'))))

Finally, bonus points if there's a way we can evenly split things across Male/Female - using these rules (appreciate this might not be possible, or difficult at best);

 

"If there are a number of employees on the exact same hourly rate of pay crossing between two of the quartiles, make sure that the males and females are split as evenly as possible across the quartiles. For e.g. 40 staff all have the same hourly rate of pay. 36 are female and 4 are male. Of them, 10 have fallen into the lower quartile, while 30 have fallen into the lower middle quartile. To evenly distribute these staff by gender the employer can see that this means for every 9 females listed, 1 male should be listed with them"

 

Thanks

 

Tom

Labels (1)
2 Replies
thooper
Contributor
Contributor
Author

Hi! Still wondering if anyone can help with this 🙂

Ben_P
Creator II
Creator II

Hi Tom.
As discussed privately, there are several ways to ensure equally-sized quartile groups. The simplest option is probably to do this in the load script, by sorting your table by hourly pay, and adding a row counter as the ranking field. If you want to sort equal pay records in a certain way, then add a second sorting criterion, e.g. Employee_Id. In order to sort the table, you'd have to load it first, then run a resident load against it specifying sort order, as below. Then in your formulae in charts etc., you'd split the records into quartiles using the Ranking_Field, and return the hourly pay values associated with those records.
 
Temp_table:
NOCONCATENATE LOAD
*,
rowno() as Ranking_Field
RESIDENT Your_table
ORDER BY HOURLY_TOTAL_PAY ASC, Employee_Id ASC
;
DROP TABLE Your_table;
RENAME TABLE Temp_table TO Your_table;
 
Regarding the second point about the male/female split, I don't think this is possible. We could do it in the load script, but then this wouldn't be dynamic, so wouldn't work in the front-end, e.g. if 1/3 records of equal pay are female, then your unfiltered ranking in the load script might run FMMFMMFMM... but this wouldn't work dynamically once the user started filtering in the front-end, if they filtered for records 2,3,4,7 in that subset, it would return a ranked result of MMFF and break the gender balance.