Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I've got the following sample data set:
Client, Product, Sales
1, Apple, 100
1, Mango, 200
2, Apple, 100
2, Carrot, 500
3, Mango, 300
4, Carrot, 200
5, Carrot, 100
5, Apple, 300
Description:
I try to calculate the Support Value. Support is the number of transactions that include items in the {A} and {B} parts of the rule as a percentage of the total number of transactions:
Support = (A and B) / Total = ("Apple" and "Carrot") / 5 = 2 / 5 = 0.4 (40%)
The following expression gets me the correct value in a text box:
=Count(DISTINCT {<Client = p({<Product = {'Apple'}>})*p({<Product = {'Carrot'}>})>} Client)/Count(DISTINCT Client)
Question:
On practice it could be much more transactions so I'd like to get some table as an output for top N combinations (by support) and play around with it using selections. Is it possible to do it? (don't want to set the products' names manually to each pair)
Expected output:
No, Combination, Support
1, "Apple" and "Carrot", 0.4
2, "Apple" and "Mango", 0.2
3, "Carrot" and "Mango", 0.0
Regards,
Evgeniy
Hi,
here's a first step :
with calculation done in script; 2 new tables :
item:
load Client, Product resident data;
inner join (item) load Client, Product as p resident item;
items:
load Client,Product & p as supitem resident item where Product <p ;
drop table item;
supportitem:
load supitem , count(Client) as compte resident items group by supitem;
you can select combination in supitem, that selects Client in your data set
regards
Hi,
here's a first step :
with calculation done in script; 2 new tables :
item:
load Client, Product resident data;
inner join (item) load Client, Product as p resident item;
items:
load Client,Product & p as supitem resident item where Product <p ;
drop table item;
supportitem:
load supitem , count(Client) as compte resident items group by supitem;
you can select combination in supitem, that selects Client in your data set
regards
Hi Olivier,
It seems awesome! Thank you so much! Especially thanks for the details!
And one more question by the way (even without your last step):
Imagine a business question: Which were combinations "affecting" on Friday?
Data:
Client, Product, Sales, WeekDay
1, Apple, 100, Thu
1, Mango, 200, Thu
2, Apple, 100, Thu
2, Carrot, 500, Thu
3, Mango, 300, Thu
4, Carrot, 200, Thu
5, Carrot, 100, Fri
5, Apple, 300, Fri
6, Carrot, 100, Fri
6, Apple, 300, Fri
6, Mango, 300, Fri
Would it be a correct solution based on your example? ( screenshot is attached)
Regards,
Evgeniy
take care Evgeniy,
the cross join is made without considering the day,
it works with you example, but doesn't work if same client have different days purchase,
so in script weekday should be add in group by
regards