
Counting instances
Karl Pover Sep 22, 2010 5:33 PM (in response to bryankoch)This would be the equivalent to COUNTBLANK in QlikView:
count(if(trim([Diag1])='' or isnull([Diag1]),[Diag1]))
because count(null()) = 0 and the if statement returns null if the condition is not met and you don't define a value when the condition is false.
so to count how many rows have values in Diag1 and Diag2 you could try
count(ICDNumber)

count(if(trim([Diag1]&[Diag2])='' or (isnull([Diag1]) and isnull([Diag2])),[Diag1]&[Diag2]))
Regards.

Counting instances
bryankoch Sep 22, 2010 5:54 PM (in response to Karl Pover )Hmm
What kind of dimension would you recommend I set up with that?

Counting instances
John Witherspoon Sep 22, 2010 5:59 PM (in response to bryankoch)I'm sorry; I don't understand the question. If you want a count of the number of rows with those three codes, that's just a single number. There's no dimension implied. You can have a dimension if you want. I just wouldn't have any idea what dimension you might want.

Counting instances
bryankoch Sep 22, 2010 6:02 PM (in response to John Witherspoon )well at some point I would like to have them grouped.
I think for now, selecting which chart to show this in is the better question for me. Because I cannot get any output to display for either answer.

Counting instances
bryankoch Sep 22, 2010 6:05 PM (in response to bryankoch)I guess to better explain, the output I am looking for is
Combo Count
12
13
14
23
34
123
234
... and so forth
And somewhere down the line be able throw a sort to where I can look at persay all the combos in a specific Taxonomy or all the combos for a specific Diagnosis Category.

Counting instances
John Witherspoon Sep 22, 2010 6:43 PM (in response to bryankoch)Ah, so you want to see every POSSIBLE combination of at least two diagnostic codes? All of these? As values for a dimension?
12
13
14
23
24
34
123
124
134
234
1234I don't know of any good way to make QlikView iterate over all the combinations of a dimension's values as a dimension. Below is pure brute force, but it might work, particularly since the number of combinations is small enough to get away with this:
AllCombinations:
CROSSTABLE (Combination,Exists)
LOAD
"Taxonomy Category"
,len(Diag1) and len(Diag2) as "12"
,len(Diag1) and len(Diag3) as "13"
...
,len(Diag1) and len(Diag2) and len(Diag3) and len(Diag4) as "1234"
RESIDENT RawDiagnosisTable
;
INNER JOIN (AllCombinations)
LOAD 1 as Exists
AUTOGENERATE 1
;
DROP FIELD Exists
;At this point, AllCombinations should be a linkage table between the Taxonomy Category and all existing combinations of your diagnosis codes. At that point, you should be able to do this:
Dimension = Combination
Expression = count(distinct "Taxonomy Category")All of this is very untested. If it doesn't work like I wrote it and you can't fix it, maybe upload a file with some sample data for me to fiddle with.

Counting instances
bryankoch Sep 23, 2010 12:42 PM (in response to John Witherspoon )Well, creating the dimensions worked perfectly:
AllCombinations:
Replace
CROSSTABLE (Combination,Exists)
LOAD "Taxonomy Category",
len(Diag1) and len(Diag2) as "12",
len(Diag1) and len(Diag3) as "13",
len(Diag1) and len(Diag4) as "14",
len(Diag2) and len(Diag3) as "23",
len(Diag2) and len(Diag4) as "24",
len(Diag3) and len(Diag4) as "34",
len(Diag1) and len(Diag2) and len(Diag3) as "123",
len(Diag1) and len(Diag2) and len(Diag4) as "124",
len(Diag1) and len(Diag3) and len(Diag4) as "134",
len(Diag2) and len(Diag3) and len(Diag4) as "234",
len(Diag1) and len(Diag2) and len(Diag3) and len(Diag4) as "1234"
RESIDENT RawDiagnosisTable;
INNER JOIN (AllCombinations)
LOAD 1 as Exists AUTOGENERATE 1;
DROP FIELD Exists;However, it's not counting correctly.
When I use distinct it returns 6 (the number of taxonomies I have currently loaded in the file) and when I remove it, enters in the total in every entry for the dimension. I thought perhaps it was because that Taxonomy is now a "key" value and wasn't counting the correct field, but turns out that wasn't the case I don't think.
I did attach the excel file (a much smaller version) but there's 100 rows from 3 different taxonomies.

Sample.xlsx 20.0 K

Counting instances
bryankoch Sep 23, 2010 12:47 PM (in response to bryankoch)Disregard that previous sample, use this one instead:

Sample.xlsx 19.9 K


Counting instances
bryankoch Sep 23, 2010 12:47 PM (in response to bryankoch)So for that data, we should see these kinds of numbers:
Number with 1 45 Number with 2 pairs 201 Number with 3 pairs 34 Number with 4 pairs 19 Sum 299
Currently I'm getting a total of 546 using the expression count(combination)

Counting instances
bryankoch Sep 23, 2010 2:29 PM (in response to bryankoch)I think I know the reason why we're seeing 546 counts instead of 299.
Consider a row that has entries in Diag1, Diag2, Diag3, Diag4.
I have a suspicion that for this row (and others), it is counting it for all combinations, thus a count for
12
23
34
14 ... and so forth. In other words, the row is not given a distinct label.
How do we remedy that?

Counting instances
John Witherspoon Sep 23, 2010 2:58 PM (in response to bryankoch)Your suspicion is correct. That's what I was TRYING to do, and what made everything complicated. OK, just because we've come this far, to do what I was thinking with what your data really looks like, you just need to add an ID for the row, and to use the ID instead of the Taxonomy Category when building the extra table. I have that working just fine in an example with your data.
But I didn't realize you wanted a DISTINCT label for the row. That's why I have a separate table. I thought you WANTED to see all those combinations. If you only want a distinct label for the row, that's TRIVIAL by comparison.
if(len(Diag1),'1')&if(len(Diag2),'2')&if(len(Diag3),'3')&if(len(Diag4),'4') as Combination
Is that what you mean?
And actually, it looks like you ALWAYS fill in the codes from left to right? If so, then the ONLY possible combinations are these:
1
12
123
1234NOT the ones we both listed before. And at that point, all you're doing is counting the number of diagnostic codes. You might as well just do this:
rangesum(len(Diag1)>0,len(Diag2)>0,len(Diag3)>0,len(Diag4)>0) as NumberOfCodes
Here's every single one of those approaches. Maybe what you're asking for is one of them. Maybe not.

testCompareCodes.qvw 138.8 K

Counting instances
bryankoch Sep 23, 2010 3:24 PM (in response to John Witherspoon )Well I can't view your document unfortunately,
But yes I do want to see all of the combinations because we don't know a great deal about the data we're playing with. Maybe for this 300 that's how it worked out, but for the other 2 mil+ it might not be, maybe it's different in another taxonomy, I don't know yet.
You gave me an idea though, which is what I think you were trying to tell me in your last post.
RawDiagnosisTable:
LOAD [Taxonomy Category],
[Diag1 Cat], [Diag2 Cat], [Diag3 Cat], [Diag4 Cat],
Diag1, Diag2, Diag3, Diag4,
if(len(Diag1),'1')&if(len(Diag2),'2')&if(len(Diag3),'3')&if(len(Diag4),'4') as Combination
FROM C:\Users\kkorynta\Desktop\Sample.xlsx
(ooxml, embedded labels, table is Sheet1);
Load * Inline [Combination
12,
13,
14,
13,
14,
23,
24,
34,
123,
124,
134,
234,
1234
];My output is a little funky though which I see why you mentioned it looked like it only loads from left to right.
Combination Count (Combination) 307 1 45 12 201 13 1 14 1 23 1 24 1 34 1 123 34 124 1 134 1 234 1 1234 19 What I don't understand, is how/why the combos with a count of 1 are reported that way, shouldn't they show as zeros? Otherwise, yes this is what I was looking for.

Counting instances
Karl Pover Sep 23, 2010 3:29 PM (in response to bryankoch)To make up for my poor answer yesterday, here's some script you can try:
SET ThousandSep=',';
SET DecimalSep='.';
SET MoneyThousandSep=',';
SET MoneyDecimalSep='.';
SET MoneyFormat='$#,##0.00;$#,##0.00';
SET TimeFormat='hh:mm:ss TT';
SET DateFormat='DD/MM/YYYY';
SET TimestampFormat='DD/MM/YYYY hh:mm:ss[.fff] TT';
SET MonthNames='ene;feb;mar;abr;may;jun;jul;ago;sep;oct;nov;dic';
SET DayNames='lun;mar;mié;jue;vie;sáb;dom';Tmp_Mapeo:
LOAD * INLINE [
Combination_Num, Map
1, 1 Pair
2, 1 Pair
3, 1 Pair
4, 1 Pair
12, 2 Pair
13, 2 Pair
14, 2 Pair
23, 2 Pair
24, 2 Pair
34, 2 Pair
123, 3 Pair
124, 3 Pair
234, 3 Pair
1234, 4 Pair
];RawDiagnosisTable:
LOAD rowno() as RowID,
[Taxonomy Category],
Diag1,
[Diag1 Cat],
Diag2,
[Diag2 Cat],
Diag3,
[Diag3 Cat],
Diag4,
[Diag4 Cat]
FROM
Sample.xlsx
(ooxml, embedded labels, table is Sheet1);AllCombinations:
Replace
CROSSTABLE (Combination,Exists,2)
LOAD RowID,
"Taxonomy Category",
len(Diag1) and not len(Diag2) and not (len(Diag3)) and not (len(Diag4)) as "1",
not len(Diag1) and len(Diag2) and not (len(Diag3)) and not (len(Diag4)) as "2",
not len(Diag1) and not len(Diag2) and (len(Diag3)) and not (len(Diag4)) as "3",
not len(Diag1) and not len(Diag2) and not (len(Diag3)) and (len(Diag4)) as "4",
len(Diag1) and len(Diag2) and not (len(Diag3)) and not (len(Diag4)) as "12",
len(Diag1) and not len(Diag2) and (len(Diag3)) and not (len(Diag4)) as "13",
len(Diag1) and not len(Diag2) and not (len(Diag3)) and (len(Diag4)) as "14",
not len(Diag1) and len(Diag2) and (len(Diag3)) and not (len(Diag4)) as "23",
not len(Diag1) and len(Diag2) and not (len(Diag3)) and (len(Diag4)) as "24",
not len(Diag1) and not len(Diag2) and (len(Diag3)) and (len(Diag4)) as "34",
len(Diag1) and len(Diag2) and len(Diag3) and not (len(Diag4)) as "123",
len(Diag1) and len(Diag2) and not (len(Diag3)) and len(Diag4) as "124",
len(Diag1) and not (len(Diag2)) and len(Diag3) and len(Diag4) as "134",
not (len(Diag1)) and len(Diag2) and len(Diag3) and len(Diag4) as "234",
len(Diag1) and len(Diag2) and len(Diag3) and len(Diag4) as "1234"
RESIDENT RawDiagnosisTable;INNER JOIN (AllCombinations)
LOAD 1 as Exists AUTOGENERATE 1;
DROP FIELD Exists;LEFT JOIN
LOAD text(Combination_Num) as Combination,
Map
Resident Tmp_Mapeo;Drop table Tmp_Mapeo;

Counting instances
bryankoch Sep 23, 2010 3:40 PM (in response to Karl Pover )Figured my little issue out with the counts. I overlooked that I was using a Key, which was counting from the other table. :/ Otherwise, using
if(len(Diag1),'1')&if(len(Diag2),'2')&if(len(Diag3),'3')&if(len(Diag4),'4') as Combinations, in my load statement with
Load * Inline [Combination
12,
13,
14,
13,
14,
23,
24,
34,
123,
124,
134,
234,
1234
];Showed me what I needed to know.
John, thanks for all the help. I really appreciate it, and if sometime in the near future I need to see all possible combinations within the row I'll know how!
And Karl, I tried yours and it works as well.
Thanks


Counting instances
John Witherspoon Sep 23, 2010 3:36 PM (in response to bryankoch)
bryankoch wrote:What I don't understand, is how/why the combos with a count of 1 are reported that way, shouldn't they show as zeros?
Remove the inline table of possible combinations. I suspect that will fix the results. I think what's happening is that even when the combination doesn't exist in your diagnosis table, it's still counting the one occurrence in your inline table.












Counting instances
John Witherspoon Sep 22, 2010 5:49 PM (in response to bryankoch)Assuming you're working with your new tables instead of your raw table, maybe something like this to count how many with Diag1, Diag2 and Diag3?
sum(aggr(count({<ICDDes={'Diag1','Diag2','Diag3'}>} distinct ICDDes)=3,CodeNumber))
The set analysis says we're "selecting" the three diagnosis codes of interest. The aggr(...,CodeNumber) says we're making a "table" by CodeNumber. For each row in this table, we check to see if the number of distinct diagnosis codes is 3, which means that that "row" has all three diagnosis codes of interest. A "true" value is 1, so if we sum these up through the whole table, then flip the sign, we should have a count of all CodeNumbers that have these three diagnosis codes.