Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have a simple table consisting of invoice ids and invoice fruits.
| Invoice ID | Fruit |
| A | Apple |
| A | Orange |
| A | Plum |
| A | Tomato |
| B | Peach |
| B | Apple |
| B | Plum |
| C | Kumquat |
| C | Grapes |
| C | Orange |
| C | Pear |
| C | Plum |
Now I want to put the co-occurrences of all fruits into a pivot table where each co-occurrence is based on the distinct number of invoices where they co-occur. So, using Apples and Plums as an example, the apple/plum or plum/apple combination would appear on 2 invoices (A and B) while the apple/orange or orange/apple would appear on 1 invoice (A only). The straight table result set would look something like the following table. I do not want to hard-code fruits in my set analysis. I just want to be able to take the set of data and return the distinct invoice count based on all of the co-occurrences. To date I have not come up with anything using set analysis. Maybe there is something elegant that eludes me. Any ideas that may be helpful? Thank you.
| Item | Co-occuring Item | Count of Invoices |
| Apple | Orange | 1 |
| Apple | Peach | 1 |
| Apple | Plum | 2 |
| Apple | Tomato | 1 |
| Grapes | Kumquat | 1 |
| Grapes | Orange | 1 |
| Grapes | Pear | 1 |
| Grapes | Plum | 1 |
| Kumquat | Grapes | 1 |
| Kumquat | Orange | 1 |
| Kumquat | Pear | 1 |
| Kumquat | Plum | 1 |
| Orange | Apple | 1 |
| Orange | Grapes | 1 |
| Orange | Kumquat | 1 |
| Orange | Pear | 1 |
| Orange | Plum | 2 |
| Orange | Tomato | 1 |
| Peach | Apple | 1 |
| Peach | Plum | 1 |
| Pear | Grapes | 1 |
| Pear | Kumquat | 1 |
| Pear | Orange | 1 |
| Pear | Plum | 1 |
| Plum | Apple | 2 |
| Plum | Grapes | 1 |
| Plum | Kumquat | 1 |
| Plum | Orange | 2 |
| Plum | Peach | 1 |
| Plum | Pear | 1 |
| Plum | Tomato | 1 |
| Tomato | Apple | 1 |
| Tomato | Orange | 1 |
Regards,
Mark
Want to try this? ...
source_data:
load
*
from https://community.qlik.com/t5/App-Development/Co-occurrence-table-Challenge/td-p/1853232
(html, utf8, embedded labels, table is @1)
;
left join(source_data)
load
[Invoice ID],
Fruit as Co_occuring_item
resident source_data
;
remove_dup:
noconcatenate load *
resident source_data
where Fruit <> Co_occuring_item;
drop table source_data;
Table dimensions: Fruit & Co_occuring_item
Measure: Count(distinct [Invoice ID])
Hi Steve,
Thank you for your quick response! I would normally consider this approach or some cross join attempt or even having it done very early at the SQL server level, but I am inheriting an app where, although not ideal, it would be preferable to do it in set analysis. If such an approach is not possible, then I will retool using a scripting approach.
If anyone has suggestions, let me know.
Thanks,
Mark
As far as I know this can't be done in set analysis alone. You will need to do something in the load script to create a secondary Fruit field. But you can leave it open to see if anyone else has an idea on a trick.
I'm almost positive sure you won't be able to get another Fruit to field that is a cross join with original without doing in the load script, but i've seen crazier tricks done by others so maybe...