Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi guys, in Oracle is quite easy to do a RFM Analysis using cum_dist() function as you can see in this example :
Deep Data Mining Blog: Recency, Frequency, Monetary (RFM) Analysis: Part 2
But I've found no way to replicate this example in Qlikview.
If I have the following table :
USER_ID | CAT | STATE_ID | CAMPAIGN_DATE | CAMPAIGN_ID | RESPONSE |
1001 | A | MA | 01-MAY-12 | A | N |
1001 | A | MA | 08-MAY-12 | A | N |
1001 | A | MA | 15-MAY-12 | A | Y |
1001 | A | MA | 22-MAY-12 | A | N |
1001 | A | MA | 29-may-12 | A | N |
1001 | A | MA | 06-JUN-12 | A | N |
1001 | B | CT | 06-JUN-12 | A | N |
1002 | B | CT | 01-may-12 | A | N |
1002 | B | CT | 08-may-12 | A | N |
1002 | B | CT | 15-may-12 | A | Y |
1002 | B | CT | 22-MAY-12 | A | N |
1002 | B | CT | 29-may-12 | A | Y |
How could I get the following result? :
USER_ID | RECENCY_DAYS | RECENCY_QUINTILE |
1001 | 151 | 3 |
1002 | 30 | 5 |
1003 | 72 | 2 |
Many thanks in advance!
Best regards, Marcel.
Hi Marcel,
I got a set of eyes on this, expect a response shortly.
Regards,
Mike Tarallo
Qlik
Hi Marcel,
Got to say I can't see how the output result in your example can be derived from the input! I might be being stupid but where did USER_ID = 1003 come from!? Also, Max(CAMPAIGN_DATE) WHERE RESPONSE = Y for 1001 and 1002 are only a few days apart so how can the RECENCY_DAYS be so far apart!?
Anyway - I think this can be done using QlikView's Rank() function. See the attached example.
(I have done the grouping in the script but you could use Aggr() to do it dynamically)
Hope this gets you on the right path,
Jason
Thanks Jason for the approach, the point is to "score" in equal quintiles a expression (in this case is recency, frequency and sales), the RFM Analysis, and it has to be dynamic.
Here you have an example of sales by customers, and my desired score :
Customer | Sales € | Score |
562 € | ||
c1 | 72 € | 5 |
c2 | 71 € | 5 |
c3 | 55 € | 5 |
c4 | 36 € | 4 |
c5 | 36 € | 4 |
c6 | 36 € | 4 |
c7 | 36 € | 3 |
c8 | 36 € | 3 |
c9 | 36 € | 3 |
c10 | 35 € | 2 |
c11 | 35 € | 2 |
c12 | 20 € | 2 |
c13 | 20 € | 1 |
c14 | 19 € | 1 |
c15 | 19 € | 1 |
I'm trying to get that score using rank(), fractile() and aggr() with no success by now.
Regards, Marcel.
Is this not it? Again - I'm not sure how you got your result exactly but my method gets close...
I'm not familiar with RFM Analysis but looked it up quickly. Your data below is one sale per customer so frequency is out and there is no date field so recency can't be worked out either can it?
Thanks Jason! It's a nice approach, now I have to take a look at this, and compare it with another developing RFM tools in excel. I'm afraid this is not the final answer. As I was seeing with another tools, in Oracle they solve it with cum_dist() function, and in excel, they solve it with percentile.inc() function, that's why I'm trying to get a perfect match in Qlikview.
In addition, you can see this is not the final solution when you select some customers, the result is not changing dinamically.
You deserve at least level 10 in the forum, thanks a lot for your help!
Regards, Marcel.
There'll be a way. I'll have another look over the weekend or maybe someone else will have a solution for you.
This is a two step calculation, and fairly straightforward in QlikView:
I would do the first step in the script, and the second in a chart.
HIC