Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
Our database has the following 3 tables (the data listed in the tables is only an example):
users:
ID | NAME | BALANCE |
REGISTERED |
1 | John Doe | 10 | 2022-11-15 06:38:36 |
2 | Mike Smith | 15 | 2022-12-16 07:42:12 |
payments: shows the payments made by users:
ID | USER_ID | AMOUNT | CREATED_AT |
1 | 2 | 20 | 2022-12-17 02:42:12 |
2 | 1 | 10 | 2022-12-18 02:42:12 |
costs: shows the costs of the users:
ID | USER_ID | COST | CREATED_AT |
1 | 2 | 4 | 2022-12-18 03:42:12 |
2 | 1 | 7 | 2022-12-19 03:42:12 |
We have the same 3 tables in our Qlik application (in our Data Model).
For each month we want to show Customer Retention Rate (CRR).
Formula for CRR is:
Customer Retention Rate formula = [(E-N)/S] x 100 S = Number of active users at the start of the time period E = Number of active users at the end of the time period N = Number of active users registered within the time period
where active users are those who have had a balance change during the time period. For example, for May 2023:
We tried to create an app/sheet/visualization that would display the information described above, but without success.
The dimension would be Year-Month (which we could somehow extract, maybe even from the Data load editor.
But the measure is the problem, it would probably be some very complicated expression. Do you have any ideas or suggestions on how we could solve this, or perhaps what the final expression used as a measure would look like?
I started working on this. It's not difficult, it's just a lot of flags and isn't really easy to explain in a forum.
- First, you have to have a Master Calendar where you flag your "date is greater than the 1st of the month" dates.
- Then, you have to break apart your User-Registration table into a table that only has Registrations as that is an activity we want to pick out
- Combine Costs and Payments to this Registrations table.
-Create a table that joins your Fact table (Registrations, Payments, Costs) to the MasterCal.
-Copy your event dates (registration date, payment date...and add 30 days to align them to the "Start of activity period" requirement.
Here is something I started on.
[Users-Registration]:
Load * Inline
[
ID, NAME, BALANCE, REGISTERED
1, John Doe, 10, 2022-11-15 06:38:36
2, Mike Smith, 15, 2022-12-16 07:42:12
]
;
[Payments]:
Load * Inline
[
ID, USER_ID, AMOUNT, CREATED_AT
1, 2, 20, 2022-12-17 02:42:12
2, 1, 10, 2022-12-18 02:42:12
]
;
[Costs]:
Load * Inline
[
ID, USER_ID, COST, CREATED_AT
1, 2, 4, 2022-12-18 03:42:12
2, 1, 7, 2022-12-19 03:42:12
]
;
/*
* Break out the users table into a single table
*
*/
[Users]:
Load ID As 'User Id',
ID As '%_user_id',
NAME
Resident [Users-Registration]
;
/*
* Create a Fact Table with cleaned up dates from the timestamps
*
*/
[Fact Table]:
Load ID As '%_user_id',
BALANCE,
'Y' As 'Is Registration',
Date(Floor(TimeStamp#(REGISTERED, 'YYYY-MM-DD h:mm:ss')), 'YYYY-MM-DD') As 'Fact Date'
Resident [Users-Registration]
;
Concatenate([Fact Table])
Load ID As 'Payment Id',
ID As '%_user_id',
AMOUNT,
'Y' As 'Is Payment',
Date(Floor(TimeStamp#(CREATED_AT, 'YYYY-MM-DD h:mm:ss')), 'YYYY-MM-DD') As 'Fact Date'
Resident [Payments]
;
Concatenate([Fact Table])
Load ID As 'Cost Id',
ID As '%_user_id',
COST,
'Y' As 'Is Costs',
Date(Floor(TimeStamp#(CREATED_AT, 'YYYY-MM-DD h:mm:ss')), 'YYYY-MM-DD') As 'Fact Date'
Resident [Costs]
;
/*
* Create a unique id for each row of the Fact Table
*
*/
[Fact Table 2]:
Load [Fact Id] As '%_fact_id',
*
;
Load RecNo() As 'Fact Id',
*
Resident [Fact Table]
;
Drop Table [Fact Table]
;
Rename Table [Fact Table 2] To [Fact Table]
;
/*
* Create the calendar join table
* flag our dates, and add 30 days so that they
* will be pushed to the 'Start of activity period'
*
*/
[Fact Calendar Join]:
Load %_fact_id,
DayStart(Floor(Num([Fact Date]))) AS '%fact_calendar_key',
'Y' As 'Standard Activity Period',
'N' As 'Start of Next Activity Period'
Resident [Fact Table]
;
Load %_fact_id,
DayStart(Floor(Num([Fact Date])))+30 AS '%fact_calendar_key',
'N' As 'Standard Activity Period',
'Y' As 'Start of Next Activity Period'
Resident [Fact Table]
;
/*
* Create a master calendar
* flagging days > 1 so we can use a flag for the 'S'
* in the CRR equation
*/
MasterCalendar:
LEFT KEEP([Fact Table])
LOAD
DayStart(Floor(Num(TempDate))) AS %fact_calendar_key,
If(Day( (TempDate) ) > 1, 'Y', 'N' ) As 'Start of Next Activity Period',
Date(DayStart(TempDate)) AS CalDate,
Dual(Month(TempDate) &' ' & Year(TempDate), MonthStart(TempDate)) As MonthYear
;
//=== Generate a temp table of dates ===
LOAD
DATE(mindate + IterNo()) AS TempDate,
mindate,
maxdate // Used in InYearToDate() above, but not kept
WHILE mindate + IterNo() <= maxdate;
//=== Get min/max dates from Field ===/
LOAD
MIN(FieldValue('%fact_calendar_key', recno()))-1 as mindate,
MAX(FieldValue('%fact_calendar_key', recno())) as maxdate
AUTOGENERATE FieldValueCount('%fact_calendar_key');
Drop Tables [Users-Registration], [Payments], [Costs]
;
Exit Script
;
I started working on this. It's not difficult, it's just a lot of flags and isn't really easy to explain in a forum.
- First, you have to have a Master Calendar where you flag your "date is greater than the 1st of the month" dates.
- Then, you have to break apart your User-Registration table into a table that only has Registrations as that is an activity we want to pick out
- Combine Costs and Payments to this Registrations table.
-Create a table that joins your Fact table (Registrations, Payments, Costs) to the MasterCal.
-Copy your event dates (registration date, payment date...and add 30 days to align them to the "Start of activity period" requirement.
Here is something I started on.
[Users-Registration]:
Load * Inline
[
ID, NAME, BALANCE, REGISTERED
1, John Doe, 10, 2022-11-15 06:38:36
2, Mike Smith, 15, 2022-12-16 07:42:12
]
;
[Payments]:
Load * Inline
[
ID, USER_ID, AMOUNT, CREATED_AT
1, 2, 20, 2022-12-17 02:42:12
2, 1, 10, 2022-12-18 02:42:12
]
;
[Costs]:
Load * Inline
[
ID, USER_ID, COST, CREATED_AT
1, 2, 4, 2022-12-18 03:42:12
2, 1, 7, 2022-12-19 03:42:12
]
;
/*
* Break out the users table into a single table
*
*/
[Users]:
Load ID As 'User Id',
ID As '%_user_id',
NAME
Resident [Users-Registration]
;
/*
* Create a Fact Table with cleaned up dates from the timestamps
*
*/
[Fact Table]:
Load ID As '%_user_id',
BALANCE,
'Y' As 'Is Registration',
Date(Floor(TimeStamp#(REGISTERED, 'YYYY-MM-DD h:mm:ss')), 'YYYY-MM-DD') As 'Fact Date'
Resident [Users-Registration]
;
Concatenate([Fact Table])
Load ID As 'Payment Id',
ID As '%_user_id',
AMOUNT,
'Y' As 'Is Payment',
Date(Floor(TimeStamp#(CREATED_AT, 'YYYY-MM-DD h:mm:ss')), 'YYYY-MM-DD') As 'Fact Date'
Resident [Payments]
;
Concatenate([Fact Table])
Load ID As 'Cost Id',
ID As '%_user_id',
COST,
'Y' As 'Is Costs',
Date(Floor(TimeStamp#(CREATED_AT, 'YYYY-MM-DD h:mm:ss')), 'YYYY-MM-DD') As 'Fact Date'
Resident [Costs]
;
/*
* Create a unique id for each row of the Fact Table
*
*/
[Fact Table 2]:
Load [Fact Id] As '%_fact_id',
*
;
Load RecNo() As 'Fact Id',
*
Resident [Fact Table]
;
Drop Table [Fact Table]
;
Rename Table [Fact Table 2] To [Fact Table]
;
/*
* Create the calendar join table
* flag our dates, and add 30 days so that they
* will be pushed to the 'Start of activity period'
*
*/
[Fact Calendar Join]:
Load %_fact_id,
DayStart(Floor(Num([Fact Date]))) AS '%fact_calendar_key',
'Y' As 'Standard Activity Period',
'N' As 'Start of Next Activity Period'
Resident [Fact Table]
;
Load %_fact_id,
DayStart(Floor(Num([Fact Date])))+30 AS '%fact_calendar_key',
'N' As 'Standard Activity Period',
'Y' As 'Start of Next Activity Period'
Resident [Fact Table]
;
/*
* Create a master calendar
* flagging days > 1 so we can use a flag for the 'S'
* in the CRR equation
*/
MasterCalendar:
LEFT KEEP([Fact Table])
LOAD
DayStart(Floor(Num(TempDate))) AS %fact_calendar_key,
If(Day( (TempDate) ) > 1, 'Y', 'N' ) As 'Start of Next Activity Period',
Date(DayStart(TempDate)) AS CalDate,
Dual(Month(TempDate) &' ' & Year(TempDate), MonthStart(TempDate)) As MonthYear
;
//=== Generate a temp table of dates ===
LOAD
DATE(mindate + IterNo()) AS TempDate,
mindate,
maxdate // Used in InYearToDate() above, but not kept
WHILE mindate + IterNo() <= maxdate;
//=== Get min/max dates from Field ===/
LOAD
MIN(FieldValue('%fact_calendar_key', recno()))-1 as mindate,
MAX(FieldValue('%fact_calendar_key', recno())) as maxdate
AUTOGENERATE FieldValueCount('%fact_calendar_key');
Drop Tables [Users-Registration], [Payments], [Costs]
;
Exit Script
;
Thanks @JustinDallas !
Hi @JustinDallas ,
I have a few questions:
1. In [Users] table there are these two columns:
```
ID As 'User Id',
ID As '%_user_id',
```
Why do we also have "%_user_id", why don't we just have "User Id" and use that everywhere instead of "%_user_id"?
(I'm not sure what "%" prefix does, I googled and found that it's supposed to hide that column from the end users of the APP?)
2. This is the Model View that we got at the end:
Is that how you thought it should turn out in the end? (please note that there are some small differences in the names of the columns, here, for the sake of simplicity, I presented that we have a "costs" table, but instead we have a "messages" table, which is essentially the same thing...)
If yes, I understand that "Fact Table" is basically a table of all activities that affected (changed) users balance (payments and costs).
Then I see that there's "MasterCalendar" table - I guess we created it in order to be able to easily calculate/determine which "activities" (from "Fact Table") belong to the specified month:
My guess is that the whole purpose of this table is to use "MonthYear" as a dimension on visualizations.
What I don't understand is this "Fact Calendar Join" table. It looks like this:
Should we use it somehow when creating visualizations for getting monthly CRRs using this formula (copied here just to be easier to find it):
Customer Retention Rate formula = [(E-N)/S] x 100 S = Number of active users at the start of the time period E = Number of active users at the end of the time period N = Number of active users registered within the time period
where active users are those who have had a balance change during the time period. For example, for May 2023:
Why do we also have "%_user_id", why don't we just have "User Id" and use that everywhere instead of "%_user_id"?
I have a strict design policy that key fields are always prefixed with %_ and only connect to two tables.
Is that how you thought it should turn out in the end?
No, the "Start of Activity" shouldn't exist on the Master Calendar table.
Should we use it somehow when creating visualizations for getting monthly CRRs using this formula (copied here just to be easier to find it):
Yes, in something like Count, you will have Count({<[Start of Previous Activity]={'N'}>}) [User Id]) to differentiate between the Fact Calendar Join types.