Identifying Repeat and Unique Customers

Nisha_Pra · ‎2020-08-06

Repeat Caller Logic within 48 hours. (Repeat caller duration is based on callstartdate and prevcallsegstartdate as per the example provided)

Formula: Numerator - Unique Contact , Denominator - Unique + Repeat Contact

CustomerPhone	CallID	Callstartdate	Callenddate	prevcallstartdate	repeatcalldurationSecs
123	Call1	2020-06-08 11:13:05.000	2020-06-08 11:28:48.000	NULL	NULL
123	Call2	2020-06-08 11:39:35.000	2020-06-08 11:50:23.000	2020-06-08 11:13:05.000	1590
123	Call3	2020-06-11 15:24:07.000	2020-06-11 15:33:56.000	2020-06-08 11:39:35.000	272672
126	Call12	2020-03-22 14:40:49.000	2020-03-22 15:09:42.000	NULL	NULL

From the above example -
In the case of 123 phone number (1-2 row number) - its a repeat contact.
This customer contacted twice within 48 hours based on the first two rows

Now in the numerator - it should be 0
in the denominator - it should be 2

3rd row - again the same customer contacted after 48 hours, it should treat as a new customer
Now numerator - 1 and denominator - 1

In the case of 126 customer phone number - its unique contact
Now numerator - 1
denominator - 1

Total KPI value - Numerator - 0+1+1
Denominator - 2+1+1

Can someone help me how to achieve this in QV scripting?

Thanks

Saravanan_Desingh · ‎2020-08-06

I am not sure why the first tow rows, Numerator should be 0. Actually the first row should be considered as a new Customer and return 1.

Saravanan_Desingh · ‎2020-08-06

Try this.

Let v48hAsMM=48*60;

tab1:
LOAD *, If(CustomerPhone<>Peek(CustomerPhone) Or IsNull(Peek(CustomerPhone)) Or CallDiff > 2,RangeSum(Peek(Key),1),Peek(Key)) As Key;
LOAD *, If(CustomerPhone=Peek(CustomerPhone), Interval(StDt-Peek(StDt))) As CallDiff;
LOAD *, Timestamp(Timestamp#(Callstartdate,'YYYY-MM-DD hh:mm:ss.fff')) As StDt;
LOAD RecNo() As RowID, * INLINE [
    CustomerPhone, CallID, Callstartdate, Callenddate, prevcallstartdate, repeatcalldurationSecs
    123, Call1, 2020-06-08 11:13:05.000, 2020-06-08 11:28:48.000, NULL, NULL
    123, Call2, 2020-06-08 11:39:35.000, 2020-06-08 11:50:23.000, 2020-06-08 11:13:05.000, 1590
    123, Call3, 2020-06-11 15:24:07.000, 2020-06-11 15:33:56.000, 2020-06-08 11:39:35.000, 272672
    126, Call12, 2020-03-22 14:40:49.000, 2020-03-22 15:09:42.000, NULL, NULL
];

Left Join(tab1)
LOAD Key, Count(Key) As Denominator
Resident tab1
Group By Key

Saravanan_Desingh · ‎2020-08-06

Expression:

=Count(DISTINCT Key)/Sum(Denominator)

Which is 3/6=0.5

Nisha_Pra · ‎2020-08-07

I agree.. Always the first row we have to consider it as a new customer. But as per the business requirement, for repeat contact, we are considering the first row as well. I am looking into your solution.

Thank you very much for your response.

Nisha_Pra · ‎2020-08-07

This is what exactly I am looking for. One thing I don't understand how you are segregating the data based on 48 hours in the logic you provided.

For example:

In case of Customer Phone Number - 123, In the key field for the third row, how it's showing as "2". This is how I wanted, If any repeat customer comes after 48 hours, should treat as a new customer. But not sure how it's working in the backend without giving the threshold as 48 hours. could you please explain?

My expected output is 2/4. Since the business doesn't want to consider the first row in the numerator in case of a repeat customer. To achieve this can we create a flag to identify repeat vs Unique which can be considered in the KPI. Please help me with this.

Thanks.

Brett_Bleess · ‎2020-08-19

I am going to provide a Design Blog link that should help explain the preceding load statements, but I cannot follow-up on the 48 hour piece, @Saravanan_Desingh will have to answer that part, will see if my flag will allow him to circle back on this one.

https://community.qlik.com/t5/Qlik-Design-Blog/Preceding-Load/ba-p/1469534

Regards,
Brett

To help users find verified answers, please do not forget to use the "Accept as Solution" button on any post(s) that helped you resolve your problem or question.
I now work a compressed schedule, Tuesday, Wednesday and Thursday, so those will be the days I will reply to any follow-up posts.

Kushal_Chawda · ‎2020-08-19

Let's say there is row 5 for 126.. which is contacted within 48 hrs..so Numerator will be 0 for row no 4?

Nisha_Pra · ‎2020-09-03

Hi @Kushal_Chawda , Yes correct. so numerator will be 0 for row no 4 since it will become repeat customer.

Kushal_Chawda · ‎2020-09-04

@Nisha_Pra try below

let v48_Hrs_check = '48:00:00';

data:
LOAD RecNo() As RowID, * INLINE [
    CustomerPhone, CallID, Callstartdate, Callenddate, prevcallstartdate, repeatcalldurationSecs
    123, Call1, 2020-06-08 11:13:05.000, 2020-06-08 11:28:48.000, NULL, NULL
    123, Call2, 2020-06-08 11:39:35.000, 2020-06-08 11:50:23.000, 2020-06-08 11:13:05.000, 1590
    123, Call3, 2020-06-11 15:24:07.000, 2020-06-11 15:33:56.000, 2020-06-08 11:39:35.000, 272672
    126, Call12, 2020-03-22 14:40:49.000, 2020-03-22 15:09:42.000, NULL, NULL
];

Left Join(data)
LOAD CustomerPhone,
     Count(CustomerPhone) as CustomerPhone_Cnt
Resident data
Group by CustomerPhone;

T1:
NoConcatenate
LOAD *,
     alt(if(CustomerPhone=Peek(CustomerPhone),if(interval(Callstartdate-Peek(Callstartdate),'hh:mm:ss')<='$(v48_Hrs_check)',1,null()),null()),1) as Denominator,
    if((CustomerPhone=Peek(CustomerPhone) and interval(Callstartdate-Peek(Callstartdate),'hh:mm:ss')>'$(v48_Hrs_check)') or
    CustomerPhone_Cnt=1,1,0) as Numerator
Resident data
Order by CustomerPhone,Callstartdate;

DROP Table data;

I feel like your denominator will always have 1 value for individual row as in either of the case for repeat or not repeat customer you are counting the rows

So probably you could use below in load script (check it out as not sure)

1 as Denominator

Now you can simply use sum(Numerator)/sum(Denominator) as KPI

Development

qv app

Script