Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Independent validation for trusted, AI-ready data integration. See why IDC named Qlik a Leader: Read the Excerpt!
cancel
Showing results for 
Search instead for 
Did you mean: 
juliekagergaard
Contributor II
Contributor II

Using the same dimension for both axis in pivot crosstable

Hey QlikView Community

I'm trying to create a matrix where the two axis are the same field, and the calculation is of how similar each object is to another determined by a certain algorithm. I.e. something like:

ABC
A100%30%55%
B30%100%70%
C55%70%100%

But I cannot get each value to be equal to all values. This means I can only get it to evaluate down the middle diagonal (i.e. the 100% values). Like:

ABC
A100%MissingMissing
BMissing100%Missing
CMissingMissing100%

As the algorithm is determined by an aggregation over a set of attribute fields from another connected table, it is not possible to just create a alias field with the second names. How do I get A, B, and C to be an option of A, so that the A-B, A-C and so forth calculations are not just missing?

1 Solution

Accepted Solutions
juliekagergaard
Contributor II
Contributor II
Author

Still not entirely it, but this helped me get it right!

I extracted my data so I have one field of equipment and then one field per area with the count of locations per equipment. Using crosstable I was then able to create a table with equipment, area and location count.

I saved this table in a temp qvd then loaded it in twice in qualified versions as table_1 and table_2 with equipment unqualified.

The pivot table then has the dimensions table_1.area and table_2.area, with the table_2.area field across.

The expression is then the following:

=  If( Table_1.area = Table_2.area, 100,  Num((Sum( Aggr((Sum(  Table_1.No_Of_Locs) + Sum(  Table_2.No_Of_Locs)) * (Count(  if( Table_1.No_Of_Locs <> 0, 1)) + Count(  if( Table_2.No_Of_Locs <> 0, 1))),  Table_1.area, Table_2.area, Equipment)))

/( Sum(Table_1.No_Of_Locs) + Sum(Table_2.No_Of_Locs) )

/ (Count( if( Table_1.No_Of_Locs <> 0, 1)) + Count( If( Table_2.No_Of_Locs <> 0, 1))) * 100, '##,00'))

As the algorithm cannot calculate the similarity between two similar areas (it will not end up being 100%), the if statement sets this value. The num() function helps limit the number of decimalpoints.

View solution in original post

10 Replies
Lisa_P
Employee
Employee

can you share your app with us ?

juliekagergaard
Contributor II
Contributor II
Author

Unfortunately not, the data is classified. The expression I'm using to calculate is the following:

Num((Sum( Aggr(Count( {[System]} DISTINCT Location) * Count( {[System]} DISTINCT Area), Category_1, Category_2, Category_3))

/ Sum( Aggr(Count( {[System]} DISTINCT Location), Category_1, Category_2, Category_3)))

/Sum( Aggr( Count( {[System]} DISTINCT Area), Category_1, Category_2, Category_3))*100, '#,00')

I'm calculating the commonality among areas of equipment. The areas are the ones indicated by A, B, and C in my original post. The categories are descriptions of the locations of equipment. {[System]} refers to a listbox, which is a dynamic choice of which collection of equipment is being viewed.

When I try to reload the area names under a different field name (e.g. Area_Name_2) there is no connection to the equipment descriptions, because the field that connects the original area field to the table with the descriptions is not loaded in. When I do load the connecting field I get the same problem as in the second table, because now they are once again only linked A-A, B-B, and C-C.

ogautier62
Specialist II
Specialist II

Hi

yes could you provide some data and the algoritm

it should be possible with an outer join of table on itself in script first

regards

ogautier62
Specialist II
Specialist II

Hi

just a dummy example,

in your matrix above, how do tou count the % (in english not in aggr) !

for A-B by example : 30% is count of B with commun category with A / count (A+B) ?

thanks

juliekagergaard
Contributor II
Contributor II
Author

I multiply the number of locations with the number of areas for each combination of category 1, 2, and 3, and then sum those values together. This is then divided by the sum of the number of locations per combination of category 1, 2, and 3.

As the boundary of this calculation is the sum of the number of areas per combination of the categories, I then divide by this sum to get the percentage of the max possible amount.

If all locations are the same in two areas, the percentage should end at 100.

ogautier62
Specialist II
Specialist II

Hi,

this a first step to make a cross table of area on itself.

in script load twice data :

qualify *;

data:

LOAD  * FROM your data source

data2:

LOAD  * FROM your data source

unqualify *;

with these dummy data :

then as dimensions :

data.area

data2.area

a) first solution (asymetric) :

expression :

sum(aggr( max( if(data.category=data2.category,1,0)), data2.area,data.ident))

/ count(total <data.area> data.ident)

b) second (symetric) :

(sum(aggr( max( if(data.category=data2.category,1,0)), data2.area,data.ident)) +

sum(aggr( max( if(data.category=data2.category,1,0)), data.area,data2.ident)) )

/ (count( total <data.area> data.ident)+ count( total <data2.area> data2.ident))

without an example, I'm not sure to understand exact calculation

regards

juliekagergaard
Contributor II
Contributor II
Author

That is unfortunately not entirely how the categories work.

The categories (1, 2, and 3) are based on an ISO-Standard and go together to describe the equipment at a location. To simplify, I have created a dummy data set where the categories are collected in one field. The data is structured as below:

Equipment*

Area A

Area B

Area C

Valve

3

3

6

Crane

0

1

0

Motor

2

3

0


Where the numbers in the area columns is the number of locations for each piece of equipment on each of the areas.

To calculate the similarity across each of the areas the algorithm comes in.

E.g. to calculate the similarity between area A and B in excel it would look like:

Equipment

Area A

Area B

Sum of locations

No of areas with equipment

Locations * Areas

Valve

3

3

6

2

12

Crane

0

1

1

1

1

Motor

2

3

5

2

10

Sum

12

5

23

The commonality index would then be:

The sum of Locations * areas divided by the sum of the locations, i.e. 23/12 = 1,9. The maximum value of the calculation is the sum of the number of areas with the equipment, i.e. 5. This means, that the commonality percentage ends at 1,9/5*100 = 38%.

The percentage is what I want shown in the matrix.

The matrix should end up symmetric, as the commonality between A-B and B-A should be the same.

I have attached a QlikView file where the dummy data is loaded in.

* Using the ISO-Standard the names would be something like, "Mechanical, Valve, Manual Valve" for category 1, Category 2, and Category 3 respectively. The names here are simplified.

ogautier62
Specialist II
Specialist II

Hi Julie,

thank's for the complements, but unfortunately I can't read qvw I've free version QV.

I made test with these data :

to make a cross product pivot table, in script :

qualify *;

data:

LOAD  * from your data source;

data2:

LOAD  * from your data source;

unqualify *;

equip:

load distinct data.equipment resident data;

area:

load distinct data.area resident data;

outer join (equip) load * resident area;

drop table area;

exit script;

then make a pivot table with as dimension :

data.area

data.equipment

data2.area

I retrieve 38% for A/B :

expressions :

result :

=locxarea/totarea/[totH+totV]

locxarea :

sum(total <data.area,data2.area> aggr((if(count(distinct data.id)>0,1,0) +

                           if(aggr(sum(if(data2.equipment=data.equipment,1,0)),data.area,data.equipment,data2.area)>0,1,0))

* (aggr(sum(if(data2.equipment=data.equipment,1,0)),data.area,data.equipment,data2.area)

+ count(distinct data.id)),data.area,data.equipment,data2.area)

     )

totarea :

sum(total <data.area,data2.area> aggr(count(distinct data.id),data.area,data.equipment,data2.area))

+ sum(total <data.area,data2.area> aggr(sum(if(data2.equipment=data.equipment,1,0)),data.area,data.equipment,data2.area))

[TotH + TotV] :

sum(total <data.area> if(aggr(count(distinct data.id),data.area,data.equipment)>0,1,0)) +

sum(total <data.area,data2.area> aggr(max(if(aggr(sum(if(data2.equipment=data.equipment,1,0)),data.area,data.equipment,data2.area)>0,1,0)),data.area,data.equipment,data2.area))

I hope you can continue with this,

otherwise attach just few rows in xslx

regards

ogautier62
Specialist II
Specialist II

could you attach too some aspirin 😉 ?