11 Replies Latest reply: Jun 6, 2014 3:37 PM by Robert Camilleri

Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

Hi,

I have what seems to be a not-so-simple request.

Data Model:

For the purposes of this discussion, let us assume I have 3 dimensions:

DIM_DATE where the key is %txn_date_sk    (for transaction date key). This is a full master calendar.

DIM_CUSTOMER where the key is %customer_sk

DIM_ABC where the key is %abc (the value of this is not important, but I want to show that i can have 2 rows for the same day and same customer).

Then I have a Fact table which contains a number of measures:

FACT_REVENUE which is linked to the above 3 dimensions (and a number of others). Let us say that FACT_REVENUE has two main measures:

Money_Put_In

Money_Taken_Out.

There cannot be a row where both Money_Put_In and Money_Taken_Out are > 0.

Problem

Now the question I am trying to answer is:

Out of the customers who have sum(Money_Put_In) > 0 in Month X - 1, how many have sum(Money_Put_In) > 0 in Month X. The result I'm looking for looks something like:

Jan 2013 - 70%

Feb 2013 - 75%

Mar 2013 - 60 %

etc.

So I cannot use set analysis, as that only gives me for the previous month "hardcoded" in the selection.

I managed to get the data in the form of:

customer_sk | Dim_Date.MonthYear | IsActiveInNextMonth

1 | Jan-2013 | Yes

1 | Feb-2013 | No

2 | Jan-2013 | Yes

...

I achieve this by putting in the SUM(Money_Put_In) in a straight table with customer_sk and Dim_Date.MonthYear, sorted by customer_sk and Dim_Date.MonthYear. Then I use the ABOVE function to make row-wise comparisons.

My plan was to use this method in an AGGR(), but I cannot figure out how to explicitly state what the sorting needs to be within the Aggregation.

Any help on this? Is there a better method which does not involve pre-calculating in a script?

Thanks,

Robert

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)
So I cannot use set analysis, as that only gives me for the previous month "hardcoded" in the selection.

You can do it using set analysis. Read this: Calculating rolling n-period totals, averages or other aggregations

I would go with AsOf table.

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

thanks for your reply. I'm not sure I'm figuring out how to use the AsOf table in this case however.

Let me try to express the solution in SQL maybe it is easier to understand the problem.

Select MonthXData.player_sk, MonthXData.month_name,

case when NextMonthData.month_name is not null then 'ACTIVE NEXT MONTH'

when NextMonthData.month_name is null then 'NOT ACTIVE NEXT MONTH'

from

(Select distinct month_name, month_number,  player_sk

from fact_revenue join dim_date using (txn_date_sk)

where money_put_in > 0)

as MonthXData

left join

(Select distinct month_name, month_number,  player_sk

from fact_revenue join dim_date using (txn_date_sk)

where money_put_in > 0)

as NextMonthData

on

MonthXData.player_sk = NextMonthData.player_sk

AND MonthXData.month_number = NextMonthData.month_number -1

How do you think I need to set up my AsOf table to achieve this?

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

Let's say you have your master calendar table "Time" and we want to link to "YearMonth" field:

AsOf:

YearMonth as AsOfYearMonth

RESIDENT Time

;

JOIN (AsOf)

if(MonthsBack=0,'Current', 'Previous') as MonthType

;

AsOfYearMonth,

iterno()-1 as MonthsBack

RESIDENT AsOf

WHILE iterno() <=2

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

Still not figuring it out completely but working on it.

Will let you know how this works out.

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

Add a qvw example with demo data and expected result.

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

I managed to get the ASOF date sorted and correctly (I think) linked to my Calendar table. However I'm not able to get the correct expression to do the count I need.

I tried to use P() set analysis to get those players who were active in the PREVIOUS month and in the CURRENT month. Results don't look good though.

I will try to create a little QVW that will illustrate the issue.

• Re: Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

Here's the QVW.

• Re: Re: Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

See file:

- Added dimension and expression in "Retention Rate" chart

- Added some more records to illustrate the fact that even if we have the same number of partners each month, we calculate correctly the retention rate.

Lucian

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

Hi Lucian, thanks a mil for that. No wonder I could not get to the answer on my own .

The result looks good, although I am still not understanding that expression and calculated dimension entirely yet. But I'll figure it out.

Thanks a lot for your help.

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

In expression:

1) you group DISTINCT Customer&Month, by current and previous months

2) if only() is null, then you have the customer in both months. Obs: isnull(only()) returns true (-1) , therefore we have a - (minus) in front of sum().

3) divide it by customers in previous month

• Re: Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)

Thanks for the explanation. I have a fun weekend implementing this on my model now