Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Check if two certain products are sold together ?

Hi,

i got a sales table that looks like

TransactionIDProductIDQuantityValue
12-12341001224.99
12-1234100219.90
12-12551001113
12-13441002219.80

I have to build a "connection rate" report to show how often certain predefined products are sold together, for example count the transactions where products 1001 and 1002 are sold within the same transactionID.

In addition there is not only one connection rate defined but more than 10, so a hardcoded expression is not going to do it.

The bad thing is, I dont really know how and where to start.

Perhaps some of you already had a similar challenge and could give me some hints whether or not such an analysis is possible within qlikview

Thanks in advance,

Max

1 Solution

Accepted Solutions
Not applicable
Author

Hi,

I am not sure if I understood your question well but I gave it a try. Pls. see the attached file.

Best regards,

http://quickdevtips.blogspot.com/

View solution in original post

12 Replies
swuehl
MVP
MVP

If I understood your requirements corretly, you could start with a datamodel that creates a duplicate productID field and link this field to your table via transaction id:

Table1:

LOAD TransactionID,

     ProductID,

     Quantity,

     Value

FROM

[http://community.qlik.com/thread/53995?tstart=0]

(html, codepage is 1252, embedded labels, table is @1);

Table2:

LOAD TransactionID, ProductID as ProductID2 resident Table1;

Then just create a grid chart with dimensions ProductID and ProductID2 (second could be re-labeled to ProductID if you want), and as expression

=if(ProductID<>ProductID2,count(TransactionID))

Hope this helps,

Stefan

Not applicable
Author

Hi swuehl,

thank you for your reply and sorry for not answering in time but i didnt have the access til now

Your solution would work if you want to analyse which products sell together frequently but not if i have predefined market baskets and count how often those specific combinations exist.

I created a new table which looks like:

Basket, ProductId, Criteria1, Criteria2

BasketNameProductIDCriteria1Criteria2
A10011
A10021
A1007
1
B10011
B1004
1
C10051
C1009
1

and created a pivot chart with Shop on x axis and BasketName on y

I tried to get the number of transaction using set analysis and it worked nicely when just counting how often Criteria1 transactions exist with

=count($<Criteria1={1}>} distinct TransactionId)

But i got a unsolvable problem with getting the transactions where criteria1 AND criteria2 are sold. I tried it with

=count($<TransactionID=P({$<Criteria1={1}>}TransactionID) , Criteria2={1}>} distinct TransactionId)

but the problem is that the P() syntax surprisingly seems to ignore the dimensions and so I get transactions where Basket "A" criteria1 is sold together with Basket "C" criteria2, too

Is there a way to select/count transactions which got at least 1 product from criteria1 and one from criteria2 within the same basket

thx,

Max

swuehl
MVP
MVP

Ok, you want to know something like 'Tell me all Transactions where at least one ProductID is either 1001 or 1002 and another ProductID is 1007', is this right?

Do you have for every defined ProductID in your Basket table a transaction in your Transaction table?

I noticed that e.g. ProductID 1001 is occurring in Basket A as well as in Basket B. In this example ProductID 1001 is assigned to Criteria1. Is it possible that ProductID 1001 could also be assigned to Criteria 2 in another Basket?

It would be nice if you could create an post a small sample qvw file that demonstrates your data as closely as possible.

With the data provided above, I can't reproduce the issues with transaction where Basket "A" criteria1 is sold together with Basket "C" criteria2.

Maybe it's possible to achieve what you want, but I think if you want to create an AND condition for criteria 1 and 2, you probably need to use an intersection of sets, maybe something like

=count(

{$<

TransactionID = P({$<Criteria1={1}>} TransactionID) * P({$<Criteria2={1} >} TransactionID) 

>}

DISTINCT TransactionID)

Regards,

Stefan

Not applicable
Author

Yes thats exactly what i mean.

And no not necessarily. The criterias need not to be unique. Perhaps for better understanding an example:

Market Basket A could be coffee and Sugar, to see how often customers who buy coffee also buy sugar.

And basket B could be sugar or sweetening as criteria 1 and tee as the second criteria.

I created a strongly simplified demo-qvw, hope this helps.

Not applicable
Author

Hi,

I am not sure if I understood your question well but I gave it a try. Pls. see the attached file.

Best regards,

http://quickdevtips.blogspot.com/

Not applicable
Author

Hi Bilge,

thanks for your help. I try to implement your aggr expression in my existing data modell and will give you feedback whether it works or not

swuehl
MVP
MVP

Max,

in your provided sample, the criteria for Product 1020 are not unique. Basket A: Crit1, Basket B: Crit 2.

If this is possible, this (and the products missing in your transaction table but stated in the basket table) makes it quite hard to develop a set analysis expression here.

Please confirm that these are your requirements.

And no, the set analysis won't consider the dimension values of Basket dimension, so you won't get automatically the correct links between criteria and transactions. For few baskets, we can get around this, but for a very large amount of baskets, this will be difficult.

You upper right table in your first example file seems to give the correct answers, right? Then you can maybe elaborate a solution by using advanced aggregation, maybe like Bilge suggested (but I can't see an advanced aggregation expression in his file, am I missing something here?)

See attached for a version using advanced aggregation (sorry if I missed Bilges solution).

Not applicable
Author

Hi,

actually it is possible. It should not be common practice, but there is a chance that a product is main criteria in one basket and second criteria in another and so i added it to the demo. Also your second point is possible, that products are added to basket criterias but none are sold within the selected time period.

I actually try out Bilges expression and it seems to do it right so far, although it isn't the most performant solution, especially with a table with 90 million product sales - but nonetheless it works.

As soon as i have cross checked my results i mark his answer as correct

Bilges used the aggr statement as a dimension, i took his fomula wrapped with a sum as an expression. As far as i can tell by now you both take the same approach to the solution.

Thank you both many times for your help

swuehl
MVP
MVP

Ah, I missed looking at the (calculated) dimensions, sorry for that.