Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
QFanatic
Creator
Creator

Some coding / modelling assistance please

Hi clever people.

I have the following scenario:

1) A table with customers with the date they visited our store:

Customer Number Store Visit Date 
123 2022/01/20 09:06:49
123 2022/01/20 09:06:55
456 2022/01/20 14:01:22

 

2) A Table (for same customers),  containing a Timestamp when this customer bought one of our products (at this time, we dont know per which visit...we only have the TIME a product was bought)

Customer Number Product_bought_date Product Bought
123 2022/01/20 09:10:49 Apples
123 2022/01/20 09:12:55 Bananas
456 2022/01/20 15:03:22 Meat

 

Horrendous data - I know.

What I've been asked to do - The customer visits the store, and I need to know if he bought any products from us, within 1 hour of his visit.....(add 1 hour to the Store Visit date , and then see if the Product_bought_date is within that 1 hour bracket).

I dont want to join these two tables, as it will mess with the rest of my model, hence I was considering using applymap somehow - but I dont know how to put together this entire thing.

 

Much appreciated

 

Labels (1)
1 Solution

Accepted Solutions
RsQK
Creator II
Creator II

Hmm.  One way could be something like this. I assume that for a purchase to be considered valid, it must occur on the same day as the visit date. Basically i flag the records, after you've got the flag, you can decide whether to remove those records or whatever else. Here's the code:

 

Visits:
LOAD *,
TIMESTAMP([Store Visit Date] + 1/24) AS [Store Visit End];

LOAD
[Customer number],
TIMESTAMP(TIMESTAMP#([Store Visit Date],'YYYY/MM/DD hh:mm:ss')) AS [Store Visit Date]
INLINE [
Customer number, Store Visit Date
123, 2022/01/20 09:06:49
123, 2022/01/20 09:06:55
456, 2022/01/20 14:01:22
];

Purchases:
LOAD *,
DATE(FLOOR(Product_bought_date)) AS Product_bought_date_floored;

LOAD
[Customer number],
TIMESTAMP(TIMESTAMP#([Product_bought_date],'YYYY/MM/DD hh:mm:ss')) AS [Product_bought_date],
[Product Bought]
INLINE [
Customer number, Product_bought_date, Product Bought
123, 2022/01/20 09:10:49, Apples
123, 2022/01/20 09:12:55, Bananas
123, 2022/01/19 09:00:00, Old Fruit
456, 2022/01/20 15:03:22, Meat
456, 2022/01/20 13:03:22, Old Meat
];

//Check if customer buys within hour:
temp_check:
LOAD DISTINCT
[Customer number],
Product_bought_date
RESIDENT Purchases;

INTERVALMATCH([Product_bought_date],[Customer number])
LEFT JOIN (temp_check)
LOAD
[Store Visit Date],
[Store Visit End],
[Customer number]
RESIDENT Visits;

temp_check2:
LOAD DISTINCT
[Customer number],
[Store Visit Date]
RESIDENT temp_check
WHERE LEN(TRIM([Store Visit Date]))>0;

DROP TABLE temp_check;

temp_all_visits:
NoConcatenate
LOAD DISTINCT
[Customer number],
[Store Visit Date]
RESIDENT Visits;

LEFT JOIN (temp_all_visits)
LOAD
[Customer number],
[Store Visit Date],
1 AS purch_flag
RESIDENT temp_check2;

DROP TABLE temp_check2;

temp_all_visits2:
LOAD
[Customer number],
[Store Visit Date],
PICK(MATCH(RANGESUM(purch_flag,0),1)+1,'No','Yes') as [Has there been a purchase]
RESIDENT temp_all_visits;

DROP TABLE temp_all_visits;

LEFT JOIN (Visits)
LOAD * RESIDENT temp_all_visits2;

DROP TABLE temp_all_visits2;

DROP FIELD [Store Visit End];

//Check for old purchases that are irrelevant - my assumption is that the purchase must occur on visit day:
temp_visit_list:
LOAD DISTINCT
[Customer number],
[Store Visit Date],
DATE(FLOOR([Store Visit Date])) AS visit_date_floored
RESIDENT Visits;

temp_visit_list2:
LOAD
[Customer number],
visit_date_floored,
TIMESTAMP(MIN([Store Visit Date])) AS min_visit_timestamp
RESIDENT temp_visit_list
GROUP BY [Customer number],visit_date_floored;

DROP TABLE temp_visit_list;

LEFT JOIN (Purchases)
LOAD
[Customer number],
visit_date_floored AS Product_bought_date_floored,
min_visit_timestamp
RESIDENT temp_visit_list2;

DROP TABLE temp_visit_list2;

Purchases2:
LOAD *,
IF(LEN(TRIM(min_visit_timestamp))=0 OR Product_bought_date < min_visit_timestamp,1,0) AS [Old purchase flag]
RESIDENT Purchases;

DROP TABLE Purchases;
RENAME TABLE Purchases2 TO Purchases;

 

 

View solution in original post

11 Replies
justISO
Specialist
Specialist

Hi, I'm not sure where you want to solve your problem, in load script or report level, but in report level you can try something like this maybe:

if ([Store Visit Date]<=Product_bought_date and Product_bought_date<= timestamp([Store Visit Date] + 1/24), 'Yes', 'No')

QFanatic
Creator
Creator
Author

Thank you for your response.

 

It looks like, for your logic to work, the two tables will need to be joined? 

RsQK
Creator II
Creator II

Hey, this?

Visits:
LOAD *,
TIMESTAMP([Store Visit Date] + 1/24) AS [Store Visit End];

LOAD
[Customer number],
TIMESTAMP(TIMESTAMP#([Store Visit Date],'YYYY/MM/DD hh:mm:ss')) AS [Store Visit Date]
INLINE [
Customer number, Store Visit Date
123, 2022/01/20 09:06:49
123, 2022/01/20 09:06:55
456, 2022/01/20 14:01:22
];

Purchases:
LOAD
[Customer number],
TIMESTAMP(TIMESTAMP#([Product_bought_date],'YYYY/MM/DD hh:mm:ss')) AS [Product_bought_date],
[Product Bought]
INLINE [
Customer number, Product_bought_date, Product Bought
123, 2022/01/20 09:10:49, Apples
123, 2022/01/20 09:12:55, Bananas
456, 2022/01/20 15:03:22, Meat
];

//Check if customer buys within hour:
temp_check:
LOAD DISTINCT
[Customer number],
Product_bought_date
RESIDENT Purchases;

INTERVALMATCH([Product_bought_date],[Customer number])
LEFT JOIN (temp_check)
LOAD
[Store Visit Date],
[Store Visit End],
[Customer number]
RESIDENT Visits;

temp_check2:
LOAD DISTINCT
[Customer number],
[Store Visit Date]
RESIDENT temp_check
WHERE LEN(TRIM([Store Visit Date]))>0;

DROP TABLE temp_check;

temp_all_visits:
NoConcatenate
LOAD DISTINCT
[Customer number],
[Store Visit Date]
RESIDENT Visits;

LEFT JOIN (temp_all_visits)
LOAD
[Customer number],
[Store Visit Date],
1 AS purch_flag
RESIDENT temp_check2;

DROP TABLE temp_check2;

temp_all_visits2:
LOAD
[Customer number],
[Store Visit Date],
PICK(MATCH(RANGESUM(purch_flag,0),1)+1,'No','Yes') as [Has there been a purchase]
RESIDENT temp_all_visits;

DROP TABLE temp_all_visits;

LEFT JOIN (Visits)
LOAD * RESIDENT temp_all_visits2;

DROP TABLE temp_all_visits2;

DROP FIELD [Store Visit End];
justISO
Specialist
Specialist

I think, no need to join, if you will try to calculate the fact of purchase in Analysis mode (report level) as your tables already connected/linked by same field in both tables [Customer Number].

If you will try to calculate it in load script, than, for temporal table with calculation of fact of purchase without joining those tables it is hard to make. But you don't need to remove your both main tables, or join them together. Structure in this case I imagine:

justISO_0-1646826576496.png

 

QFanatic
Creator
Creator
Author

Thank you very much - I am testing as we speak.

Please explain the following line to me?

PICK(MATCH(RANGESUM(purch_flag,0),1)+1,'No','Yes') as [Has there been a purchase]

Thank you! 

QFanatic
Creator
Creator
Author

I am not sure I understand you - sorry 😞

are you saying in the front end I add expression - if ([Store Visit Date]<=Product_bought_date and Product_bought_date<= timestamp([Store Visit Date] + 1/24), 'Yes', 'No')?

I am a pain sorry - please can you paste the script here so I understand a bit better?

Thanks again

justISO
Specialist
Specialist

What I was suggesting was to create simple table in report level, with your dimensions (Customer Number, Store Visit Date and Product bought date) and one measure with formula I quickly wrote:

justISO_0-1646833544122.png

But this provide many-to-many representation and is not so correct to use, so I believe, @RsQK solution is much better.

QFanatic
Creator
Creator
Author

Hi RsQK,

Your solution is looking very good.

Only thing that I'd like to add - there are 'old' transactions in the Purchases table (older than the Store_Visit_Date)

How do I get rid of those?

Thanks!

RsQK
Creator II
Creator II

Hmm.  One way could be something like this. I assume that for a purchase to be considered valid, it must occur on the same day as the visit date. Basically i flag the records, after you've got the flag, you can decide whether to remove those records or whatever else. Here's the code:

 

Visits:
LOAD *,
TIMESTAMP([Store Visit Date] + 1/24) AS [Store Visit End];

LOAD
[Customer number],
TIMESTAMP(TIMESTAMP#([Store Visit Date],'YYYY/MM/DD hh:mm:ss')) AS [Store Visit Date]
INLINE [
Customer number, Store Visit Date
123, 2022/01/20 09:06:49
123, 2022/01/20 09:06:55
456, 2022/01/20 14:01:22
];

Purchases:
LOAD *,
DATE(FLOOR(Product_bought_date)) AS Product_bought_date_floored;

LOAD
[Customer number],
TIMESTAMP(TIMESTAMP#([Product_bought_date],'YYYY/MM/DD hh:mm:ss')) AS [Product_bought_date],
[Product Bought]
INLINE [
Customer number, Product_bought_date, Product Bought
123, 2022/01/20 09:10:49, Apples
123, 2022/01/20 09:12:55, Bananas
123, 2022/01/19 09:00:00, Old Fruit
456, 2022/01/20 15:03:22, Meat
456, 2022/01/20 13:03:22, Old Meat
];

//Check if customer buys within hour:
temp_check:
LOAD DISTINCT
[Customer number],
Product_bought_date
RESIDENT Purchases;

INTERVALMATCH([Product_bought_date],[Customer number])
LEFT JOIN (temp_check)
LOAD
[Store Visit Date],
[Store Visit End],
[Customer number]
RESIDENT Visits;

temp_check2:
LOAD DISTINCT
[Customer number],
[Store Visit Date]
RESIDENT temp_check
WHERE LEN(TRIM([Store Visit Date]))>0;

DROP TABLE temp_check;

temp_all_visits:
NoConcatenate
LOAD DISTINCT
[Customer number],
[Store Visit Date]
RESIDENT Visits;

LEFT JOIN (temp_all_visits)
LOAD
[Customer number],
[Store Visit Date],
1 AS purch_flag
RESIDENT temp_check2;

DROP TABLE temp_check2;

temp_all_visits2:
LOAD
[Customer number],
[Store Visit Date],
PICK(MATCH(RANGESUM(purch_flag,0),1)+1,'No','Yes') as [Has there been a purchase]
RESIDENT temp_all_visits;

DROP TABLE temp_all_visits;

LEFT JOIN (Visits)
LOAD * RESIDENT temp_all_visits2;

DROP TABLE temp_all_visits2;

DROP FIELD [Store Visit End];

//Check for old purchases that are irrelevant - my assumption is that the purchase must occur on visit day:
temp_visit_list:
LOAD DISTINCT
[Customer number],
[Store Visit Date],
DATE(FLOOR([Store Visit Date])) AS visit_date_floored
RESIDENT Visits;

temp_visit_list2:
LOAD
[Customer number],
visit_date_floored,
TIMESTAMP(MIN([Store Visit Date])) AS min_visit_timestamp
RESIDENT temp_visit_list
GROUP BY [Customer number],visit_date_floored;

DROP TABLE temp_visit_list;

LEFT JOIN (Purchases)
LOAD
[Customer number],
visit_date_floored AS Product_bought_date_floored,
min_visit_timestamp
RESIDENT temp_visit_list2;

DROP TABLE temp_visit_list2;

Purchases2:
LOAD *,
IF(LEN(TRIM(min_visit_timestamp))=0 OR Product_bought_date < min_visit_timestamp,1,0) AS [Old purchase flag]
RESIDENT Purchases;

DROP TABLE Purchases;
RENAME TABLE Purchases2 TO Purchases;