Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Averages: Remove top and bottom 10%

I have the below script that calculates the averages of business days and calendar days. The problem, I have outlier values that are skewing results.

Can I:

Remove top 10% from the average?

Remove top 10% and bottom 10% from the average?

LOAD

D AS CONTACT_D,
Avg([FIRST_CONTACT_CAL_DAYS])AS Avg_FirstPhoneContact_Days,
Avg([FIRST_CONTACT_WK_DAYS])AS Avg_FIRST_CONTACT_WK_DAYS

FROM $(vPath_QVDs)MainTable.qvd(qvd)

Where 1=1
AND NOT IsNull(FIRST_CONTACT_CAL_DAYS)
AND NOT IsNull(FIRST_CONTACT_WK_DAYS)
AND START_DT > Addmonths(today(),-12)

GROUP BY
D

1 Solution

Accepted Solutions
sunny_talwar

Should be simple

Avg({<ASSIGNED_STATUS = {'NEWLY ASSIGNED MEMBER'}>} Aggr(

If(Only({<ASSIGNED_STATUS = {'NEWLY ASSIGNED MEMBER'}>} FIRST_CONTACT_CAL_DAYS) >=

Fractile({<ASSIGNED_STATUS = {'NEWLY ASSIGNED MEMBER'}>} TOTAL <PROGRAM> FIRST_CONTACT_CAL_DAYS, 0.10) AND

Only({<ASSIGNED_STATUS = {'NEWLY ASSIGNED MEMBER'}>} FIRST_CONTACT_CAL_DAYS) <=

Fractile({<ASSIGNED_STATUS = {'NEWLY ASSIGNED MEMBER'}>}TOTAL <PROGRAM> FIRST_CONTACT_CAL_DAYS, 0.90),

FIRST_CONTACT_CAL_DAYS), PROGRAM, id))

View solution in original post

13 Replies
Anonymous
Not applicable
Author

Perhaps create an indicator on my source data to capture top and bottom 10%

sunny_talwar

May be use Fractile functions to remove Top 10 and Bottom 10 from the overall observation and then use the remaining data to calculate your average?

Anonymous
Not applicable
Author

I have two example and neither are working. Can you help with the syntax?

//,AVG({<D = {"> $(=Fractile(FIRST_CONTACT_CAL_DAYS,0.20)) <= $(=Fractile(FIRST_CONTACT_CAL_DAYS,0.40))"}>} DISTINCT D)

//avg(if([FIRST_CONTACT_CAL_DAYS] > aggr(NODISTINCT fractile(FIRST_CONTACT_CAL_DAYS, 0.1), D ) and [FIRST_CONTACT_CAL_DAYS] < aggr(NODISTINCT fractile(FIRST_CONTACT_CAL_DAYS, 0.9), CLT_CLT_PGMID ),D))

sunny_talwar

Will be difficult without some data to work with... Also, is this something you want in the script or front end?

Anonymous
Not applicable
Author

I get some data

Anonymous
Not applicable
Author

See attached sample data

sunny_talwar

What is the number you expect to see here?

Anonymous
Not applicable
Author

I'm trying to capture the average column

FIRST_CONTACT_CAL_DAYS but want to exclude the top and bottom 10% to exclude the outliers.

sunny_talwar

Is 0 an outlier you would want to exclude? Because it might be one of the bottom 10%?