Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I was working on the below dataset sometime back and johnw has helped me develop a frequency and percentage matrix for the dual combination of these attributes. original post here: Create frequency count crosstab matrix and for each combination calculate percentages of target var
copying the content from original post here:
I have the following dataset
ID attribute1 attribute2 attribute3 BAD (Y-> yes)
1 1 0 0 Y
2 1 1 0 N
3 1 0 1 Y
4 0 1 1 N
5 1 1 0 Y
6 1 1 0 N
7 1 0 1 Y
8 1 0 1 N
I want to create frequency count matrix, for each combination of attribute tripped (atrribute = 1 means tripped) like following:
attribute1 attribute2 attribute3
attribute1 _ 3 3
attribute2 3 _ 1
attribute3 3 1 _
It represents counts of attributes tripped together.
Once i have the counts I need to calcualte %BAD in each combination like below.
attribute1 attribute2 attribute3
attribute1 _ 12.50% 25%
attribute2 12.50% _ 0%
attribute3 25% 0% _
it represents out of 3 ID's for the combination atribute1 and atribute2 tripped, one ID went BAD -> so 12.5% BAD
And out of 3 ID's for the combination attribute1 and attribute1 tripped, two went BAD -> so 25% BAD
And the 1 ID for the combination attribute2 and attribute3 tripped, didnt went BAD -> so 0% BAD
Currently I am trying to do this in a report/table layout rather than a matrix layout for every possible combination of the attributes. I have the below dataset.
My dataset look like this:
ID | grade | type | score range | attribute1 | attribute2 | attribute3 | attribute4 | attribute5 | BAD (Y-yes/N- No) |
1 | A | 1 | 100-110 | 1 | 0 | 0 | 1 | 0 | Y |
2 | A | 1 | 120-130 | 1 | 1 | 0 | 1 | 1 | N |
10 | A | 1 | 100-110 | 1 | 0 | 0 | 1 | 0 | N |
3 | B | 2 | 70-80 | 1 | 0 | 1 | 1 | 0 | Y |
4 | C | 3 | 30-40 | 0 | 1 | 1 | 0 | 1 | N |
5 | A | 3 | 40-50 | 1 | 1 | 0 | 1 | 1 | Y |
6 | B | 4 | 80-90 | 1 | 1 | 0 | 1 | 1 | N |
7 | C | 2 | 40-50 | 1 | 0 | 1 | 1 | 0 | Y |
8 | C | 4 | 30-40 | 1 | 0 | 1 | 1 | 0 | N |
Report I'm trying to build: Goal is to build this report to show frequencies and also percentages.
A1 | A2 | A3 | A4 | A5 | A1, A2 | A1, A3 | A1, A4 | A1, A5 | A2, A3 | A2, A4 | A2, A5 | A3, A4 | A3, A5 | A4, A5 | A1, A2, A3 | A1, A2, A4 | A1, A2, A5 | A1, A3, A4 | A1, A3, A5, | A1, A4, A5 | A1, A2, A3, A4 | A1, A2, A4, A5 | A1, A3, A4, A5 | A1, A2, A3, A4, A5 | |||
BAD | grade | score range | |||||||||||||||||||||||||
yes | A | 100-110 | 1 | 1 | 1 | ||||||||||||||||||||||
110-120 | |||||||||||||||||||||||||||
120-130 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |||||||||||||
B | 90-100 | ||||||||||||||||||||||||||
80-90 | |||||||||||||||||||||||||||
70-80 | |||||||||||||||||||||||||||
C | 40-50 | ||||||||||||||||||||||||||
30-40 | |||||||||||||||||||||||||||
No | A | 100-110 | 2 | 1 | 2 | 2 | 1 | 1 | |||||||||||||||||||
110-120 | |||||||||||||||||||||||||||
120-130 | |||||||||||||||||||||||||||
B | 90-100 | ||||||||||||||||||||||||||
80-90 | |||||||||||||||||||||||||||
70-80 | |||||||||||||||||||||||||||
C | 40-50 | ||||||||||||||||||||||||||
30-40 |
Thanks