Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Analysis with Diagram

Hello everbody,

i have the following problem:

My data is sorted into different orders which contain various binary attributes.

qv orders.bmp

I want to display the data in a graphic which shows the corrleations of single attributes to other single attributes.

The size of the Bubble should be a measurement for correlation. The problem is, that i have the same axe in one diagram twice.

qv Forum 2.bmp

I tried it with the Matrixdiagram and the Pointdiagram.

The Problem of the Pointdiagram is that its axes are  formulas and I am not able to categorize to the specific attributes.

The Pointdiagram looks pretty fine, but im not able to insert the same dimension twice.

It would be very nice if you can give me any hints or tipps or just the side of the manual

Down is displayed the logic for the correlation:

qv forum 1.bmp

1 Solution

Accepted Solutions
swuehl
MVP
MVP

Sodele,

=if(SACode<>SACode5,

count(aggr(NODISTINCT if(ProdNummer=%ProdNummer5,1),ProdNummer,%ProdNummer5)) / count(ProdNummer)

)

View solution in original post

11 Replies
swuehl
MVP
MVP

One possible approach is to load your dimension twice, something like

Transactions:

LOAD

recno() as TransID,

ceil(RAND()*100) as PersonID,

chr(floor(RAND()*10+65)) as Category

AutoGenerate 150;

LOAD

PersonID,

Category as Category2

resident Transactions;

if you want to use Category for axis. You can easily relabel Category2 to Category in dimension tab.

See also attached,

Stefan

Not applicable
Author

Thank you very much swuehl for the fast reply 😃

I'll try to load the dimension twice.

But there is another cirumstand:

There is a difference between the Correlation A to B and B to A.

example:

order1: a , b , c

order2 a , b , d

order3 a , c , d

now is A to B 2/3 ~ 66 % correlation

          B to A is 100% correlation

how can I implemt this into my model?

Thank you very much 😃

Not applicable
Author

Another problem is that i have a 100% correlation between the identical attributes:

How can I exclude them from the selection?

qv 4.bmp

swuehl
MVP
MVP

Maybe like attached?

Not applicable
Author

Thank you very much swuehl.

I am very impressed that you get the right solutions that fast! Im working with QV for about 1 Month but you are amazing ! =). The model is exactly what i need - Respect.

Im trying to adopt your model to my main model, but i have a question im struggeling with for hours..

In the picture i described your formula with my comments, in red is the part of the formula i don't understand.

It would be very helpful if you can explain that part to me, or correct me if i explained something wrong.

qv5.bmp

swuehl
MVP
MVP

You got it mostly right, except the part with the aggr() dimensions.

In my expression

if(Bin<>Bin2,count(aggr(NODISTINCT if(Order=Order2,1),Bin,Bin2,Order,Order2)) / count(Order))

I used an advanced aggregation with expression if(Order=Order2,1) and dimensions Bin,Bin2, Order, Order2.

You can imagine the advanced aggregation like a table inside the expression.

In fact, since we are already using Bin and Bin2 as dimesions to our chart, you don't need to state them here (but it won't harm, the results of this aggr() table will be automatically limited to the values of the outer dimensions.

Alternatively, if you keep all four dimensions, you might remove the NODISTINCT qualifier.

So if you look at one cell in our chart, e.g. Bin=a and Bin2=b, the aggr() will iterate over all possible combinations of Order and Order2 values and check the expression, which results in two records with 1:

Order     Order2          Bin   Bin2          count(if(Order2=Order,1))

                                                              2

order1     order1          a          b               1

order1     order2          a          b               0

order2     order1          a          b               0

order2     order2          a          b               1

order3     order1          a          b               0

order3     order2          a          b               0

Then you devide by the count of Order to get the percentage (which is 3, not 6, because the count is outside the aggr(), so not all combinations are taken into account). So we get a result for our chart of 2/3.

To iterate over all possible combinations, it is important that our tables are not linked, this is an important change to my first example.

I hope I made myself somewhat clear.

Regards,

Stefan

Not applicable
Author

Hello Stafan,

thank you very much for your solution.

But i wasn't able to assign it to my solution.

I uploaded my model, the structure is a bit different. It would be very helpfull for me because im searching the mistake for about 2 days =(.

In my model are 3 Pages, the first for the overview, the second is a time analysis and the third the correlation.

In the correlation tab i tried to implement your second solution, but it didnt work.

In the time correlation i search for the correct expression. in the first graph is shown the sold orders. In the second graph i want to display the percentage of single elements of the order.

When i display it absolutely the singel elemets follow the deviation of the total sales so i cant get any information out it.

So i want to divide it through the sales to get a percentage. But it divides only through the sales which contain that specift element so i get a correlation from 100%.

Thank you very much for your expert knowledge

greetings marko

swuehl
MVP
MVP

I think you need to keep the Prodnummer also in your data island table for the correlation, not only the SACode.

Something like

//Laden unverknüpft für Korrelation Signifikanz

LOAD

[Produktionsnummer] as %ProdNummer5,

text(subfield(purgechar([Prod.codes],'" '),'#')) as SAcode5

FROM

(txt, codepage is 1252, embedded labels, delimiter is ';', no quotes);

Then you should be able to change

=if(SACode<>SAcode5,count(aggr(NODISTINCT if(ProdNummer<>ProdNummer,1),Prodnummer,) / count(ProdNummer))

to

=if(SACode<>SACode5,

count(aggr(NODISTINCT if(ProdNummer=ProdNummer5,1),ProdNummer,ProdNummer5)) / count(ProdNummer)

)

Not applicable
Author

I implemented the changes but i didnt mange it to run.

I added the data file so you can start imports..

It would be very nice if you could help me again 😃