Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
slacayo
Contributor III
Contributor III

Create a Column with "Join" Date

I have a table:

LOAD * INLINE [
 Onum, CID,    Date  
516, 1, 2020-10-05
557, 2, 2021-04-09
667, 3, 2020-11-15
788, 3, 2021-02-08
556, 2, 2020-10-31
556, 3, 2021-08-09
333, 1, 2021-09-09
444, 1, 2020-10-01
];

 I am attempting to do a cohort analysis, so I am trying to create a column with the cohort data (Year-Month they first purchased ), and rebills for every `CID`. 

For the cohort, the logic would be something like

 

 

Psuedo-code

cohort_column = table.groupby('CID')['Date'] \
                .transform('min') \
                .to_period(Y-M) 

 

 

I want to add two additional columns: One with the cohort date and one with gives the purchase count with respect to the date. So my table should look like:

LOAD * INLINE [
Onum, CID, Date, cohort, count
516, 1, 2020-10-05, 2020-10, 2
333, 1, 2021-09-09, 2020-10, 3
444, 1, 2020-10-01, 2020-10, 1
557, 2, 2021-04-09, 2020-09, 2
556, 2, 2020-09-31, 2020-09, 1
667, 3, 2020-11-15, 2020-11, 1
788, 3, 2021-02-08, 2020-11, 2
556, 3, 2021-08-09, 2020-11, 3 ];

 

 

3 Replies
maxgro
MVP
MVP

If I understand your question, try with this script


t:
LOAD Onum, CID, Date(Date#(Date2, 'YYYY-MM-DD')) as Date2 INLINE [
Onum, CID, Date2, cohort, count
516, 1, 2020-10-05, 2020-10, 2
333, 1, 2021-09-09, 2020-10, 3
444, 1, 2020-10-01, 2020-10, 1
557, 2, 2021-04-09, 2020-09, 2
556, 2, 2020-09-30, 2020-09, 1
667, 3, 2020-11-15, 2020-11, 1
788, 3, 2021-02-08, 2020-11, 2
556, 3, 2021-08-09, 2020-11, 3
];

// calc cohort with a group by
LEFT JOIN (t)
LOAD
CID,
Date(min(Date2), 'YYYY-MM') as cohort
Resident t
GROUP BY CID;

 

// calc count with a resident load order by

t3:
LOAD
*,
IF(cohort = Previous(cohort), peek('count')+1, 1) as count
Resident t
ORDER BY cohort, Date2;

DROP Table t;

 

 

 

maxgro_0-1639774388024.png

 

slacayo
Contributor III
Contributor III
Author

Thanks a lot for the response @maxgro — I will attempt to implement your solution and either follow up with you/mark it as a working solution sometime today. 

slacayo
Contributor III
Contributor III
Author

Hi @maxgro , the first part of your solution works well! 

However, the code to calculate the counts might not be the solution I am looking for. I believe it's iterating as long as the previous cohort value is the same. However, in some instances users `CSID` have the same cohort so it may be attributing counts to other people. Is there anyway to change the criteria where your code applies only if its the same 'CSID'? Thanks.