Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have what seems to be a not-so-simple request.
Data Model:
For the purposes of this discussion, let us assume I have 3 dimensions:
DIM_DATE where the key is %txn_date_sk (for transaction date key). This is a full master calendar.
DIM_CUSTOMER where the key is %customer_sk
DIM_ABC where the key is %abc (the value of this is not important, but I want to show that i can have 2 rows for the same day and same customer).
Then I have a Fact table which contains a number of measures:
FACT_REVENUE which is linked to the above 3 dimensions (and a number of others). Let us say that FACT_REVENUE has two main measures:
Money_Put_In
Money_Taken_Out.
There cannot be a row where both Money_Put_In and Money_Taken_Out are > 0.
Problem
Now the question I am trying to answer is:
Out of the customers who have sum(Money_Put_In) > 0 in Month X - 1, how many have sum(Money_Put_In) > 0 in Month X. The result I'm looking for looks something like:
Jan 2013 - 70%
Feb 2013 - 75%
Mar 2013 - 60 %
etc.
So I cannot use set analysis, as that only gives me for the previous month "hardcoded" in the selection.
I managed to get the data in the form of:
customer_sk | Dim_Date.MonthYear | IsActiveInNextMonth
1 | Jan-2013 | Yes
1 | Feb-2013 | No
2 | Jan-2013 | Yes
...
I achieve this by putting in the SUM(Money_Put_In) in a straight table with customer_sk and Dim_Date.MonthYear, sorted by customer_sk and Dim_Date.MonthYear. Then I use the ABOVE function to make row-wise comparisons.
My plan was to use this method in an AGGR(), but I cannot figure out how to explicitly state what the sorting needs to be within the Aggregation.
Any help on this? Is there a better method which does not involve pre-calculating in a script?
Thanks,
Robert
See file:
- Added AsOf table
- Added dimension and expression in "Retention Rate" chart
- Added some more records to illustrate the fact that even if we have the same number of partners each month, we calculate correctly the retention rate.
Lucian
So I cannot use set analysis, as that only gives me for the previous month "hardcoded" in the selection.
You can do it using set analysis. Read this: Calculating rolling n-period totals, averages or other aggregations
I would go with AsOf table.
thanks for your reply. I'm not sure I'm figuring out how to use the AsOf table in this case however.
Let me try to express the solution in SQL maybe it is easier to understand the problem.
Select MonthXData.player_sk, MonthXData.month_name,
case when NextMonthData.month_name is not null then 'ACTIVE NEXT MONTH'
when NextMonthData.month_name is null then 'NOT ACTIVE NEXT MONTH'
from
(Select distinct month_name, month_number, player_sk
from fact_revenue join dim_date using (txn_date_sk)
where money_put_in > 0)
as MonthXData
left join
(Select distinct month_name, month_number, player_sk
from fact_revenue join dim_date using (txn_date_sk)
where money_put_in > 0)
as NextMonthData
on
MonthXData.player_sk = NextMonthData.player_sk
AND MonthXData.month_number = NextMonthData.month_number -1
How do you think I need to set up my AsOf table to achieve this?
Let's say you have your master calendar table "Time" and we want to link to "YearMonth" field:
AsOf:
LOAD DISTINCT
YearMonth as AsOfYearMonth
RESIDENT Time
;
JOIN (AsOf)
LOAD *,
if(MonthsBack=0,'Current', 'Previous') as MonthType
;
LOAD
AsOfYearMonth,
AddMonths(AsOfYearMonth, 1-iterno()) as YearMonth,
iterno()-1 as MonthsBack
RESIDENT AsOf
WHILE iterno() <=2
Still not figuring it out completely but working on it.
Will let you know how this works out.
Add a qvw example with demo data and expected result.
I managed to get the ASOF date sorted and correctly (I think) linked to my Calendar table. However I'm not able to get the correct expression to do the count I need.
I tried to use P() set analysis to get those players who were active in the PREVIOUS month and in the CURRENT month. Results don't look good though.
I will try to create a little QVW that will illustrate the issue.
Here's the QVW.
See file:
- Added AsOf table
- Added dimension and expression in "Retention Rate" chart
- Added some more records to illustrate the fact that even if we have the same number of partners each month, we calculate correctly the retention rate.
Lucian
Hi Lucian, thanks a mil for that. No wonder I could not get to the answer on my own .
The result looks good, although I am still not understanding that expression and calculated dimension entirely yet. But I'll figure it out.
Thanks a lot for your help.