Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in NYC Sept 4th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
gmuradov
Partner - Contributor III
Partner - Contributor III

Filtering on a chart which shows most items are frequently bought together

Hello, 

I am trying to make a barchart to show which 2, 3 or 4 items are most frequently bought together. 

My sales orders are stored in the table ps2_trans, the sales orders rows in ps2_rows where I have item_id and trans_id (id of the sales order).

This is my load script 
Select
t.id AS trans_id,
r.id AS row_id,
GROUP_CONCAT(r.item_id ORDER BY r.item_id SEPARATOR ' == ') AS concat_item_ids,
GROUP_CONCAT(i.nm->>"$.bg" ORDER BY r.item_id SEPARATOR '; ') AS concat_items,
COUNT(r.id) AS rows_cnt,
r.item_id as item_id
FROM
ps2_trans t,
ps2_rows r,
ps2_items i
WHERE
t.id=r.trans_id AND
r.item_id=i.id
GROUP BY
t.id
HAVING
rows_cnt > 1;

this is my dimension: =IF(LEN(concat_items) - LEN(REPLACE(concat_items, ';', '')) = 1(2 or 3), concat_items)
this is the measure: COUNT(trans_id)

since i am using group by t.id in my query therefore i cannot apply any filters on my bar chart (brand, category, item) 
can anybody help with a more intelligent way to solve this problem?

Labels (1)
12 Replies
marcus_sommer

You may try it only with a calculated UI dimension like:

aggr(concat(Item, ' + '), t.id)

and as expression:

count(t.id)

gmuradov
Partner - Contributor III
Partner - Contributor III
Author

unfortunately this doesn't work

marcus_sommer

What happens and in which way is it not like your expectation?

gmuradov
Partner - Contributor III
Partner - Contributor III
Author

Well it just didnt give any results.
Just to be clear what I am trying to achieve is this:
Ex: I have 6 sales orders:
SO1 
- 1 apple 
- 1 pear
- 1 orange
SO2 
- 1 apple 
- 1 orange
SO3 
- 1 apple 
- 1 orange
- 1 banana
SO4 
- 1 apple 
- 1 pear 
SO5 
- 1 pear
- 1 banana
SO6
- 1 apple 
- 1 orange 
- 1 banana
- 1 pineapple
 
The expected result is:
 
Which 2 items are bought most frequently together:
1 - apple, orange - 4 times bought together
2 - apple, pear - 2 times bought together
3 - orange, banana - 2 times bought together
3 - apple, banana - 2 times bought together
etc.
 
Which 3 items are bought most frequently together:
1 - apple, orange, banana - 2 times bought together
2 - apple, pear, orange - 1 times bought together
3 - apple, pineapple, banana - 1 times bought together
4 - pineapple, banana, orange - 1 times bought together
etc. 
marcus_sommer

My suggestion takes always all items from an order and not 2 or 3 or n ones in any combination which is much more complex. IMO my approach isn't not bad for the needed efforts and it provides already some insights.

I assume that your requirement will need n cartesian loop-logic - each item against each other item from an order to be able to apply such count-approach. Only aggregations regardless if in script or UI won't be enough.

If I had this task I would use Excel with this 6 orders to develop a working logic. If it worked I would extend it to n more orders as well as to a few dependencies / context-parts - to prove the working and to get an imagination to the needed efforts and performance-impact by transferring the logic into the script and/or the UI. I think it won't be trivial ...

Qrishna
Master
Master

sample data pls?

gmuradov
Partner - Contributor III
Partner - Contributor III
Author

I got this, but on my app the data will come form a db

marcus_sommer

The source itself isn't important.

It looked that there are a lot of items - therefore I suggest to consider to apply the logic against an upper categorization of the items and/or to restrict the bought together views on certain categories - because otherwise the cartesian data-set might become quite (too) large. In each case it would prove if the logic itself is working and provides the wanted insights.

gmuradov
Partner - Contributor III
Partner - Contributor III
Author

what I uploaded is like a 0.05% of the total data