Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Cross-sell opportunities - market basket analysis. Try to find the probabilities of two unique items being sold at the same time period (maybe within same month-year, or just year) how can i achieve these
This can be achieved by using different techniques. Let me try a more Bayesian one.
These are some simple steps:
// Step 1: Data Preparation
// 1.1 Load Data: Ensure you have your sales data loaded into Qlik Sense. Your data should include at least these fields (or something like it)
- TransactionID
- ItemID
- Date
// 1.2. Create Date Fields: Depending on your analysis period (monthly, yearly), you can create additional fields. Look at the Community for Master Calendar for more details.
SalesData:
LOAD
TransactionID,
ItemID,
Date,
Year(Date) as Year,
Month(Date) as Month,
Year(Date) & '-' & Month(Date) as YearMonth
RESIDENT SalesData;
// Step 2: Data Aggregation
// 2.1. Cross Join Data: Create a cross join to get combinations of items sold together.
CrossJoin:
LOAD
TransactionID,
ItemID as ItemA
RESIDENT SalesData;
JOIN (CrossJoin)
LOAD
TransactionID,
ItemID as ItemB
RESIDENT SalesData
WHERE ItemID <> ItemA;
// 2.2. Calculate Frequencies: Aggregate the data to calculate the frequencies of combinations.
FrequencyTable:
LOAD
ItemA,
ItemB,
YearMonth,
COUNT(DISTINCT TransactionID) as Count
RESIDENT CrossJoin
GROUP BY ItemA, ItemB, YearMonth;
FrequencyTableYearly:
LOAD
ItemA,
ItemB,
Year,
COUNT(DISTINCT TransactionID) as Count
RESIDENT CrossJoin
GROUP BY ItemA, ItemB, Year;
// Step 3: Probability Calculation
// 3.1. Calculate Total Transactions: Calculate the total number of transactions for each item and time period.
TotalTransactions:
LOAD
ItemID,
YearMonth,
COUNT(DISTINCT TransactionID) as TotalCount
RESIDENT SalesData
GROUP BY ItemID, YearMonth;
TotalTransactionsYearly:
LOAD
ItemID,
Year,
COUNT(DISTINCT TransactionID) as TotalCount
RESIDENT SalesData
GROUP BY ItemID, Year;
// 3.2. Join Total Transactions: Join the total transaction data to your frequency table.
LEFT JOIN (FrequencyTable)
LOAD
ItemID as ItemA,
YearMonth,
TotalCount as TotalCountA
RESIDENT TotalTransactions;
LEFT JOIN (FrequencyTable)
LOAD
ItemID as ItemB,
YearMonth,
TotalCount as TotalCountB
RESIDENT TotalTransactions;
LEFT JOIN (FrequencyTableYearly)
LOAD
ItemID as ItemA,
Year,
TotalCount as TotalCountA
RESIDENT TotalTransactionsYearly;
LEFT JOIN (FrequencyTableYearly)
LOAD
ItemID as ItemB,
Year,
TotalCount as TotalCountB
RESIDENT TotalTransactionsYearly;
// 3.3. Calculate Probability: Add a field to calculate the probability of two items being sold together.
FrequencyTable:
LOAD
*,
(Count / (TotalCountA + TotalCountB - Count)) as Probability
RESIDENT FrequencyTable;
FrequencyTableYearly:
LOAD
*,
(Count / (TotalCountA + TotalCountB - Count)) as Probability
RESIDENT FrequencyTableYearly;
// Step 4: Visualization in Qlik Sense
4.1. Create a Pivot Table: Add a pivot table to your Qlik Sense sheet.
4.2. Add Dimensions: Use `ItemA` and `ItemB` as dimensions.
4.3. Add Measures: Use `Probability` as the measure to show the probability of items being sold together.
// Step 5: Filter by Time Period
Add filters for `Year` and `Month` to allow dynamic analysis based on different time periods.
This approach should work for a lot of cases. I have not tested this specific script and since I wrote from heart, you might need to adjust here and there, but it gives you the idea of how to perform basic Bayesian theorem in Qlik.