Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi.
I'm trying to use the fractile function to create a dimension that would distribute a series of userIDs into quartiles (Q1-Q4 as a string) The table underneath is the expected result:
Min | Q1 | Q2 | Q3 | Max |
20000 | 24000 | 25500 | 32750 | 50000 |
UserID | Salary | Quartile |
1 | 25000 | Q2 |
2 | 30000 | Q3 |
3 | 25000 | Q2 |
4 | 20000 | Q1 |
5 | 24000 | Q1 |
6 | 22500 | Q1 |
7 | 23500 | Q1 |
8 | 35000 | Q4 |
9 | 40000 | Q4 |
10 | 42000 | Q4 |
11 | 28000 | Q3 |
12 | 24000 | Q1 |
13 | 30000 | Q3 |
14 | 23000 | Q1 |
15 | 24500 | Q2 |
16 | 40000 | Q4 |
17 | 50000 | Q4 |
18 | 25000 | Q2 |
19 | 26000 | Q3 |
20 | 32000 | Q3 |
I'm currently using this code to achieve this but i'm not getting the string values that i could use as a dimension:
(fractile([Salary],0.25)) as "Q-0.25"
,
(fractile([Salary],0.50)) as "Q-0.50"
,
(fractile([Salary],0.75)) as "Q-0.75"
,
(fractile([Salary],1)) as "Q-1"
group by "UserID"
;
I've also tried:
IF (
fractile([Salary],0.00) <
(fractile([Salary],0.25)), 'Q-0.25'
,
IF (
fractile([Salary],0.26) <
(fractile([Salary],0.50)), 'Q-0.50'
,
IF (
fractile([Salary],0.51) <
(fractile([Salary],0.75)), 'Q-0.75'
,
IF (
fractile([Salary],0.76) <
(fractile([Salary],1)), 'Q-1'
)
)))
as Quartile_Salary,
Any ideas how i should aproach this?
Thanks in advance, any help is much appreciated.
Best regards.
Hi,
If you are doing this in script I think you need to calculate the quartiles then join back before doing a new load to assign, so like this;
data:
load * inline [
UserID, Salary
1, 25000
2, 30000
3, 25000
4, 20000
5, 24000
6, 22500
7, 23500
8, 35000
9, 40000
10, 42000
11, 28000
12, 24000
13, 30000
14, 23000
15, 24500
16, 40000
17, 50000
18, 25000
19, 26000
20, 32000
];
join (data)
Load
Fractile(Salary,0.25) AS Q1,
Fractile(Salary,0.5) AS Q2,
Fractile(Salary,0.75) AS Q3
Resident data;
data_final:
NoConcatenate
Load
UserID,
Salary,
If(Salary<Q1,'Q-0.25',
If(Salary<Q2,'Q-0.50',
If(Salary<Q3,'Q-0.75',
'Q-1'
)
)
) AS Quartile_Salary
RESIDENT data;
DROP TABLE data;
Will be different if you want a calculated dimension (i.e. only done on the fly) though.
Cheers,
Chris.
Cheers mate!
That did the trick.
Thanks a lot for the help.