Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
phoenix
Creator
Creator

Create frequency count crosstab matrix and for each combination calculate percentages of target var

Hello experts,

I have the following dataset
ID   attribute1   attribute2   attribute3     BAD (Y-> yes)
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N
I want to create frequency count matrix, for each combination of attribute tripped (atrribute = 1 means tripped) like following:
                          attribute1    attribute2    attribute3
attribute1                  _                3                  3
attribute2                 3                 _                 1
attribute3                 3                 1                  _
It represents counts of attributes tripped together.
Once i have the counts I need to calcualte %BAD in each combination like below.

                      attribute1     attribute2     attribute3
attribute1             _               12.50%         25%
attribute2        12.50%             _                   0%
attribute3            25%              0%                _
it represents out of 3 ID's for the combination atribute1 and atribute2 tripped, one ID went BAD -> so 12.5% BAD
And out of 3 ID's for the combination attribute1 and attribute1 tripped, two went BAD -> so 25% BAD
And the 1 ID for the combination attribute2 and attribute3 tripped, didnt went BAD -> so 0% BAD
Could anyone please let me know if this could done in Qlikview?
Thanks
1 Solution

Accepted Solutions
johnw
Champion III
Champion III

Capture.PNG

Maybe there's a cleaner way, but this is what I threw together:

Cross:
CROSSTABLE (AttributeX,Flag)
LOAD
ID
,Attribute1
,Attribute2
,Attribute3
RESIDENT Raw
;
INNER JOIN (Cross)
LOAD 1 as Flag
AUTOGENERATE 1
;
DROP FIELD Flag;

LEFT JOIN (Cross)
LOAD
ID
,AttributeX as AttributeY
RESIDENT Cross
;
INNER JOIN (Cross)
LOAD *
RESIDENT Cross
WHERE AttributeX <> AttributeY
;

And the chart:

Dimension 1 = AttributeX
Dimension 2 = AttributeY
Expresson = count({<BAD={'Y'}>} distinct ID)/count(total distinct ID)

View solution in original post

6 Replies
johnw
Champion III
Champion III

Capture.PNG

Maybe there's a cleaner way, but this is what I threw together:

Cross:
CROSSTABLE (AttributeX,Flag)
LOAD
ID
,Attribute1
,Attribute2
,Attribute3
RESIDENT Raw
;
INNER JOIN (Cross)
LOAD 1 as Flag
AUTOGENERATE 1
;
DROP FIELD Flag;

LEFT JOIN (Cross)
LOAD
ID
,AttributeX as AttributeY
RESIDENT Cross
;
INNER JOIN (Cross)
LOAD *
RESIDENT Cross
WHERE AttributeX <> AttributeY
;

And the chart:

Dimension 1 = AttributeX
Dimension 2 = AttributeY
Expresson = count({<BAD={'Y'}>} distinct ID)/count(total distinct ID)

phoenix
Creator
Creator
Author

Thanks John, it works great.

I have a pretty big dataset of several other variables that I have to create filters for.

Wondering if there are more variables to toggle like below and user has to select desired

model->category->bin->....  to calculate the Freq and BAD rate matrix. If there is any other method that you could think of, please let me know.

ID   model     category  bin   attribute1   attribute2   attribute3     BAD (Y-> yes)

1     1                 a         1-2         1           0               0                 Y

2     1                 a         1-2         1           1               0                 N

3     1                 b         1-2         1           0               1                 Y

4     2                 c         1-2          0           1               1                 N

5     2                 a         3-4         1           1               0                 Y

6     3                 b         3-4         1           1               0                 N

7     3                 a         4-5        1           0               1                 Y

8     4                 c         3-4        1           0               1                 N

johnw
Champion III
Champion III

Add as many attributes as you want. Same script. Seems to produce the results I would expect. If they aren't right, what results do YOU want when making selections on this data?

phoenix
Creator
Creator
Author

Thank you! Tried it on my data, works great!

One last question. I tried using colormix on the resulting pivot table to create a heatmap. It works fine for frequency matrix using the expression

=RGB(255,

     255*(1-frequency/100),

     255*(1-frequency/100)

    )

Here 'frequency' is count(distinct Application_Number)

When I tried to use same logic to create heatmap for Bad rate matrix using below expression I am not getting desired results

=RGB(255,

     255*(1-rate/100),

     255*(1-rate/100)

    )

Here 'rate' is count({<EVER_PST_DUE_60_FLG={'Y'}>} distinct Application_Number)/count(total distinct Application_Number)

Any suggestions on this?

johnw
Champion III
Champion III

Your data is completely different in the second table. Imagine you had data like that in the first table - it's all less than 1, so it would all be white, just like 1 is in your first table. And that's what you see. You need a new expression for the second table. You could probably use the wizard again based on THIS data set instead of on the previous. It might come up with something like this.

RGB(255,255*(1-rate*500),255*(1-rate*500))

phoenix
Creator
Creator
Author

Got it. That expression works.Thanks a lot for your help on this.