Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I'm looking for a way to find per client in my database the interval in days between buying something & their last digital interaction we had with them before they bought.
This is an example of how my raw datafile looks like (also attached :
Date | Activity | Client | |
1/01/17 | Item bought | Client A | |
10/12/16 | Item bought | Client A | |
18/11/16 | Item bought | Client A | |
27/10/16 | Item bought | Client A | |
5/10/16 | Item bought | Client A | |
13/09/16 | Item bought | Client A | |
27/10/16 | Website activity | Client A | |
5/10/16 | Website activity | Client A | |
13/08/16 | Rep visit | Client A | <- this is the last activity before the first buy. Days between last activity and item bought: 31 days |
1/01/17 | Item bought | Client B | |
10/12/16 | Item bought | Client B | |
18/11/16 | Item bought | Client B | |
27/10/16 | Item bought | Client B | |
5/10/16 | Item bought | Client B | |
13/09/16 | Item bought | Client B | |
2/01/17 | Website activity | Client B | |
5/10/15 | Website activity | Client B | <- this is the last activity before the first buy. Days between last activity and item bought: 344 days |
13/09/15 | Website activity | Client B | |
1/01/17 | Item bought | Client C | |
10/12/16 | Item bought | Client C | |
18/11/16 | Item bought | Client C | |
27/10/16 | Item bought | Client C | |
5/10/16 | Item bought | Client C | |
13/09/16 | Item bought | Client C | |
27/12/15 | Website activity | Client C | |
5/10/16 | Website activity | Client C | |
12/09/16 | Email opened | Client C | <- this is the last activity before the first buy. Days between last activity and item bought: 1 day |
The desired outcome is a table like this:
Client | # days between last activity & first purchase |
Client A | 31 |
Client B | 344 |
Client C | 1 |
I'm in the dark on how to do this properly...
Any help would be greatly appreciated!
Kind regards,
Christophe
Thanks David !!
This worked for me. I adjusted the script like this to get the outcome I wanted:
Master:
LOAD Date,
Activity,
Client
FROM
[clientactivities.xlsx]
(ooxml, embedded labels, table is Blad1);
SamplePreload:
LOAD
RowNo() as RowID,
Client,
Date,
Activity,
If(Client = Previous(Client) AND Activity = 'Item bought' AND Previous(Activity) <> 'Item bought',
Date-Previous(Date), Null()) AS DaysSinceLastActivity
RESIDENT Master
ORDER BY Client, Date;
Sample:
LOAD FirstSortedValue(DaysSinceLastActivity,RowID) as DaysSinceLastActivity,
Client
Resident SamplePreload
Where len(DaysSinceLastActivity) > 0
Group By Client;
DROP TABLE Master, SamplePreload;
This returned me:
Client | DaysSinceLastActivity |
Client A | 31 |
Client B | 344 |
Client C | 1 |
Try like this :
Master:
LOAD
Date,
Activity,
Client
from table ;
LOAD
Client,
max(Date)-min(Date) as Number_Of_Days
resident
Master
group by
Client:
Hi Avinash,
Thanks for your quick reply, but here is some extra complexity in the data I shared. I'm not looking for the days between the very first and very last activity per client. I'm looking for the # days between the last activity that happened before the first sale & the actual first sale.
Kind regards,
Christophe
How to identify that "the last activity that happened before the first sale"
Sorry, let me clarify. There are 2 main types of activities in my dataset:
1) Item bought
2) All other activities
In order to identify the 2 dates to substract from each other, I need to know:
1) What was the first date when an item was bought
2) What was the date of the activity that preceded that buy. So if I I buy something on Dec 31st 2016, and I did 3 other things before that date (like visiting the website) on the 30th, the 21st & the 18th of that same month, I'm only interested in the 30th, as that's the last activity I did before buying something. The other 2 days are to be ignored.
So in this very simple table:
01/01/2017 <- second item bought
31/12/2016 <- first item bought
30/12/2016 <- last activity before buying first item
21/12/2016 <- second to last activity before buying first item
18/12/2016 <- third to last activity before buying first item
Only the dates in bold are of my interest. The number I'm looking for would then be "1" day in this case.
Anyone has a creative idea on how to approach this?
Check this? Based on your excel sheet this is what i got. Let me know if i have missed anything.
Hi,
You can use something like the script below, but you need to fine tune it because this will give you the difference in days for every sale preceded by a non-sale activity within a client bucket:
//Initial load _temp_Sample: LOAD Date, Activity, Client, E as Comment FROM clientactivities.xlsx (ooxml, embedded labels, table is Blad1); //Sorted Sample: LOAD Client, Date, Activity, Comment ,If(Client = Previous(Client) AND Activity = 'Item bought' AND Previous(Activity) <> 'Item bought', Date-Previous(Date), Null()) AS DaysSinceLastActivity RESIDENT _temp_Sample ORDER BY Client, Date ; DROP TABLE _temp_Sample;
Regards,
David
Hi Vishwarath,
Thanks for this. The one thing it's missing is that it should ignore any other activity / sales data after the first sale.
For client C, it returns 261 days while it should have been 1 day:
Client C did 2 activities before his very first buy. The activity relevant for me is the one which took place on September 12th (the email open). The first sale on Sept 13th is indeed the one to take into account.
So basically, in the table I should have:
Client | Days Difference | ItemBoughtDate | LastActivityDate |
---|---|---|---|
Client C | 1 | 9/13/2016 | 9/12/2016 |
Thanks a lot already for your help provided!
Kind regards,
Christophe
Thanks David !!
This worked for me. I adjusted the script like this to get the outcome I wanted:
Master:
LOAD Date,
Activity,
Client
FROM
[clientactivities.xlsx]
(ooxml, embedded labels, table is Blad1);
SamplePreload:
LOAD
RowNo() as RowID,
Client,
Date,
Activity,
If(Client = Previous(Client) AND Activity = 'Item bought' AND Previous(Activity) <> 'Item bought',
Date-Previous(Date), Null()) AS DaysSinceLastActivity
RESIDENT Master
ORDER BY Client, Date;
Sample:
LOAD FirstSortedValue(DaysSinceLastActivity,RowID) as DaysSinceLastActivity,
Client
Resident SamplePreload
Where len(DaysSinceLastActivity) > 0
Group By Client;
DROP TABLE Master, SamplePreload;
This returned me:
Client | DaysSinceLastActivity |
Client A | 31 |
Client B | 344 |
Client C | 1 |