11 Replies Latest reply: Jun 6, 2014 3:37 PM by Robert Camilleri RSS

    Customer Retention Rates - Sorting within AGGR() (or any other solution appreciated)



      I have what seems to be a not-so-simple request.


      Data Model:


      For the purposes of this discussion, let us assume I have 3 dimensions:

           DIM_DATE where the key is %txn_date_sk    (for transaction date key). This is a full master calendar.

           DIM_CUSTOMER where the key is %customer_sk

           DIM_ABC where the key is %abc (the value of this is not important, but I want to show that i can have 2 rows for the same day and same customer).


      Then I have a Fact table which contains a number of measures:

      FACT_REVENUE which is linked to the above 3 dimensions (and a number of others). Let us say that FACT_REVENUE has two main measures:




      There cannot be a row where both Money_Put_In and Money_Taken_Out are > 0.




      Now the question I am trying to answer is:


      Out of the customers who have sum(Money_Put_In) > 0 in Month X - 1, how many have sum(Money_Put_In) > 0 in Month X. The result I'm looking for looks something like:


      Jan 2013 - 70%

      Feb 2013 - 75%

      Mar 2013 - 60 %



      So I cannot use set analysis, as that only gives me for the previous month "hardcoded" in the selection.


      I managed to get the data in the form of:


      customer_sk | Dim_Date.MonthYear | IsActiveInNextMonth

      1 | Jan-2013 | Yes

      1 | Feb-2013 | No

      2 | Jan-2013 | Yes



      I achieve this by putting in the SUM(Money_Put_In) in a straight table with customer_sk and Dim_Date.MonthYear, sorted by customer_sk and Dim_Date.MonthYear. Then I use the ABOVE function to make row-wise comparisons.


      My plan was to use this method in an AGGR(), but I cannot figure out how to explicitly state what the sorting needs to be within the Aggregation.


      Any help on this? Is there a better method which does not involve pre-calculating in a script?