In the table below the number of purchases is calculated based on a hardcoded date.
What I need to accomplish is to calculate it from the Start and End dates for each combination of billing period and payment cycle, so that each Purchases value is based on different dates.
Using the formulas for Start and End dates in the set analysis does not work.
A pre-aggregated table is not an option because of the practically unlimited combinations of payment cycles and billing periods and the volume of data.
Any suggestions on how to accomplish that?
How does the underlying table look in the data model?
Below is a sample of the relevant data. There are multiple additional dimensions.
event_date | subscription_id | payment_cycle | billing_period | metric | value |
1/2/2022 | 274627645 | 1 | yearly | direct_purchase | 1 |
1/2/2023 | 274627645 | 2 | yearly | renewal | 1 |
1/2/2024 | 274627645 | 3 | yearly | renewal | 1 |
5/7/2023 | 982473564 | 0 | monthly | started_trial | 1 |
8/7/2023 | 982473564 | 1 | monthly | after_trial_purchase | 1 |
8/8/2023 | 982473564 | 2 | monthly | renewal | 1 |
8/9/2023 | 982473564 | 3 | monthly | renewal | 1 |
8/10/2023 | 982473564 | 4 | monthly | renewal | 1 |
So what is the expected output from this data?
The renewals made during the period selected in the calendar are split by billing period and payment cycle. For each of those groups I am trying to calculate how many new purchases the renewals came from.
Example:
Calendar period: 10.01.2024 - 20.01.2024
If we have 5 yearly renewals with payment cycle 3 in this period, I need to find out how many new yearly purchases were made 2 (payment cycle -1) yearly periods back (10.01.2022-20.01.2022).
In the same column, for monthly subscriptions I have to calculate the new purchases for 2 monthly periods back (10.11.2023-20.11.2023).
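To pin down the date arithmetic in the example, here is a minimal sketch in Python (not Qlik syntax). The names `shift_months`, `purchase_window` and `MONTHS_PER_PERIOD` are made up for illustration, and it assumes only the two billing periods shown in the sample, monthly and yearly.

```python
import calendar
from datetime import date

# Assumption: only these two billing periods exist, as in the sample data.
MONTHS_PER_PERIOD = {"monthly": 1, "yearly": 12}

def shift_months(d, months):
    """Move d by a (possibly negative) number of months, clamping the day to the month end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

def purchase_window(start, end, billing_period, payment_cycle):
    """Shift the calendar window back by (payment_cycle - 1) billing periods."""
    months_back = (payment_cycle - 1) * MONTHS_PER_PERIOD[billing_period]
    return shift_months(start, -months_back), shift_months(end, -months_back)

yearly = purchase_window(date(2024, 1, 10), date(2024, 1, 20), "yearly", 3)
monthly = purchase_window(date(2024, 1, 10), date(2024, 1, 20), "monthly", 3)
```

With the calendar period 10.01.2024 - 20.01.2024 and payment cycle 3, `yearly` is the window 10.01.2022 - 20.01.2022 and `monthly` is 10.11.2023 - 20.11.2023, matching the periods above.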
I have managed to get the correct periods for each value (Start and End date) but cannot find a way to use them in the set analysis. I have tried saving their formulas in variables but since their values are different for each cell of the table, the formula doesn't work.
Pre-aggregating in the data model may not be the most suitable way, but that doesn't mean you couldn't pre-calculate some essential results, for example by creating one or several flag fields.
This might be done with interrecord-functions within sorted resident-loads like:
load *, if(previous(id) = id, 0, 1) as FirstFlag
resident X order by id, date;
Executed with a descending sort order, the same load could create a LastFlag, and instead of a 0/1 flag an accumulation might also be applied. An alternative may be a self-joining aggregation load with min/max/count followed by a check against those results.
Also conceivable are approaches with exists() and/or offset calculations of the dates relative to today(), and various similar measures.
The aim behind the above hints is to unify the different dates and numbers of occurrences per id into globally valid values. In the simplest case it's a 0/1 flag that is always the same for each id and can therefore be queried in a set analysis.
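The flag logic can be sketched in Python to show what the sorted load is meant to produce; this is only an illustration of the idea, not Qlik script. The function name `add_flags` and the reduced row layout (`id`, `date`, `metric`) are assumptions, with values borrowed from the sample data and dates read as day/month/year.

```python
from datetime import date

def add_flags(rows):
    """Sort by id and date, then mark the first and last row of every id with 1."""
    ordered = sorted(rows, key=lambda r: (r["id"], r["date"]))
    flagged = []
    for i, row in enumerate(ordered):
        # Neighbouring ids play the role of previous(id) in the sorted load.
        prev_id = ordered[i - 1]["id"] if i > 0 else None
        next_id = ordered[i + 1]["id"] if i + 1 < len(ordered) else None
        flagged.append({
            **row,
            "FirstFlag": 0 if row["id"] == prev_id else 1,
            "LastFlag": 0 if row["id"] == next_id else 1,
        })
    return flagged

rows = [
    {"id": 982473564, "date": date(2023, 8, 8), "metric": "renewal"},
    {"id": 274627645, "date": date(2023, 2, 1), "metric": "renewal"},
    {"id": 982473564, "date": date(2023, 7, 5), "metric": "started_trial"},
    {"id": 274627645, "date": date(2022, 2, 1), "metric": "direct_purchase"},
]
flags = [(r["metric"], r["FirstFlag"], r["LastFlag"]) for r in add_flags(rows)]
```

Here the earliest row of each id (`direct_purchase`, `started_trial`) gets FirstFlag = 1 and the latest row gets LastFlag = 1, regardless of the order in which the rows arrive.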