Test:
Load*Inline [
Month,user_id
jan,1
jan,2
jan,3
jan,4
jan,5
jan,6
jan,7
jan,8
jan,9
feb,1
feb,5
feb,10
feb,11
feb,12
March,1
March,2
March,3
March,13
March,15];
exit script;
This is my data set.
Now my requirement is that i want to have a ratio which should be equal to no of user_id repeated in march from jan but they should not be included in feb.
for example there are 9 user_id in jan and out of which 2 are repeating in feb i.e, 1 and 5 so we will remove that from the base of jan and it comes out to be 7 user_id now we will move to march and we can see that march and jan has 3 common user_id which are 1,2 and 3 but 1 has already been removed while calculating feb user_id so now 2 and 3 remains which are common between jan and feb so now our ration should be common/base which is 2/9.
In short i want a metric named as M-2 which means my denominator will jan and numerator will be common user_id between jan and march but it should not include the user_id which already has been removed in feb month as mentioned in above example.
I tried this in M-1 scenario and it works in that
formula i used is
=sum(aggr(count({<Month={'jan'}>}distinct user_id)
*
count({<Month={'feb'}>}distinct user_id),user_id))
Please help in the second metric which M-2 mentioned in starting.
Thanks,
Rajat