Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Counting instances

Hi folks,

First let me explain what kind of data I'm working with.

The original data set has the following variables and setups

[Taxonomy Category], Diag1, [Diag1 Category], Diag2, [Diag2 Category], Diag3, [Diag3 Category], Diag4, [Diag4 Category]

All of the categories are text and have 20 to 50 different categories. All of the Diag (Diagnosis Codes) are codes, mostly numeric, some with a letter at the beginning (ie E909).

I have 2 versions of the qualified data (partly because I don't know how to get a combined version)

DiagnosisTable:
CROSSTABLE (ICD, ICDCategory, 2)
LOAD recno() as ID, [Taxonomy Category],
[Diag1 Cat],
[Diag2 Cat],
[Diag3 Cat],
[Diag4 Cat]
Resident RawDiagnosisTable;

DiagnosisCode:
Crosstable (ICDDes, ICDNumber, 2)
Load recno() as CodeNumber, [Taxonomy Category],
Diag1,
Diag2,
Diag3,
Diag4
Resident RawDiagnosisTable;

I want to explore the combinations of nonblank cells between the diagnosis codes by row.

For example, how many rows have values in Diag1 AND Diag2, how many with Diag1 and diag2 and diag3? etc

I know in Excel you can use a function called COUNTBLANK and a few if/then statements to achieve this effect with the original data structure.

In Qlikview I am not aware of a function that can do this, nor am I well acquainted with the syntax for expressions.

Can anyone help?

Thanks!

1 Solution

Accepted Solutions
johnw
Champion III
Champion III

Your suspicion is correct. That's what I was TRYING to do, and what made everything complicated. OK, just because we've come this far, to do what I was thinking with what your data really looks like, you just need to add an ID for the row, and to use the ID instead of the Taxonomy Category when building the extra table. I have that working just fine in an example with your data.

But I didn't realize you wanted a DISTINCT label for the row. That's why I have a separate table. I thought you WANTED to see all those combinations. If you only want a distinct label for the row, that's TRIVIAL by comparison.

if(len(Diag1),'1')&if(len(Diag2),'2')&if(len(Diag3),'3')&if(len(Diag4),'4') as Combination

Is that what you mean?

And actually, it looks like you ALWAYS fill in the codes from left to right? If so, then the ONLY possible combinations are these:

1
12
123
1234

NOT the ones we both listed before. And at that point, all you're doing is counting the number of diagnostic codes. You might as well just do this:

-rangesum(len(Diag1)>0,len(Diag2)>0,len(Diag3)>0,len(Diag4)>0) as NumberOfCodes

Here's every single one of those approaches. Maybe what you're asking for is one of them. Maybe not.

View solution in original post

16 Replies
pover
Luminary Alumni
Luminary Alumni

This would be the equivalent to COUNTBLANK in QlikView:

count(if(trim([Diag1])='' or isnull([Diag1]),[Diag1]))

because count(null()) = 0 and the if statement returns null if the condition is not met and you don't define a value when the condition is false.

so to count how many rows have values in Diag1 and Diag2 you could try

count(ICDNumber)

-

count(if(trim([Diag1]&[Diag2])='' or (isnull([Diag1]) and isnull([Diag2])),[Diag1]&[Diag2]))

Regards.

johnw
Champion III
Champion III

Assuming you're working with your new tables instead of your raw table, maybe something like this to count how many with Diag1, Diag2 and Diag3?

-sum(aggr(count({<ICDDes={'Diag1','Diag2','Diag3'}>} distinct ICDDes)=3,CodeNumber))

The set analysis says we're "selecting" the three diagnosis codes of interest. The aggr(...,CodeNumber) says we're making a "table" by CodeNumber. For each row in this table, we check to see if the number of distinct diagnosis codes is 3, which means that that "row" has all three diagnosis codes of interest. A "true" value is -1, so if we sum these up through the whole table, then flip the sign, we should have a count of all CodeNumbers that have these three diagnosis codes.

Not applicable
Author

Hmm

What kind of dimension would you recommend I set up with that?

johnw
Champion III
Champion III

I'm sorry; I don't understand the question. If you want a count of the number of rows with those three codes, that's just a single number. There's no dimension implied. You can have a dimension if you want. I just wouldn't have any idea what dimension you might want.

Not applicable
Author

well at some point I would like to have them grouped.

I think for now, selecting which chart to show this in is the better question for me. Because I cannot get any output to display for either answer.

Not applicable
Author

I guess to better explain, the output I am looking for is

Combo Count

1-2

1-3

1-4

2-3

3-4

1-2-3

2-3-4

... and so forth

And somewhere down the line be able throw a sort to where I can look at persay all the combos in a specific Taxonomy or all the combos for a specific Diagnosis Category.

johnw
Champion III
Champion III

Ah, so you want to see every POSSIBLE combination of at least two diagnostic codes? All of these? As values for a dimension?

12
13
14
23
24
34
123
124
134
234
1234

I don't know of any good way to make QlikView iterate over all the combinations of a dimension's values as a dimension. Below is pure brute force, but it might work, particularly since the number of combinations is small enough to get away with this:

AllCombinations:
CROSSTABLE (Combination,Exists)
LOAD
"Taxonomy Category"
,len(Diag1) and len(Diag2) as "12"
,len(Diag1) and len(Diag3) as "13"
...
,len(Diag1) and len(Diag2) and len(Diag3) and len(Diag4) as "1234"
RESIDENT RawDiagnosisTable
;
INNER JOIN (AllCombinations)
LOAD -1 as Exists
AUTOGENERATE 1
;
DROP FIELD Exists
;

At this point, AllCombinations should be a linkage table between the Taxonomy Category and all existing combinations of your diagnosis codes. At that point, you should be able to do this:

Dimension = Combination
Expression = count(distinct "Taxonomy Category")

All of this is very untested. If it doesn't work like I wrote it and you can't fix it, maybe upload a file with some sample data for me to fiddle with.

Not applicable
Author

Well, creating the dimensions worked perfectly:

AllCombinations:
Replace
CROSSTABLE (Combination,Exists)
LOAD "Taxonomy Category",
len(Diag1) and len(Diag2) as "12",
len(Diag1) and len(Diag3) as "13",
len(Diag1) and len(Diag4) as "14",
len(Diag2) and len(Diag3) as "23",
len(Diag2) and len(Diag4) as "24",
len(Diag3) and len(Diag4) as "34",
len(Diag1) and len(Diag2) and len(Diag3) as "123",
len(Diag1) and len(Diag2) and len(Diag4) as "124",
len(Diag1) and len(Diag3) and len(Diag4) as "134",
len(Diag2) and len(Diag3) and len(Diag4) as "234",
len(Diag1) and len(Diag2) and len(Diag3) and len(Diag4) as "1234"
RESIDENT RawDiagnosisTable;

INNER JOIN (AllCombinations)
LOAD -1 as Exists AUTOGENERATE 1;
DROP FIELD Exists;

However, it's not counting correctly.

When I use distinct it returns 6 (the number of taxonomies I have currently loaded in the file) and when I remove it, enters in the total in every entry for the dimension. I thought perhaps it was because that Taxonomy is now a "key" value and wasn't counting the correct field, but turns out that wasn't the case I don't think.

I did attach the excel file (a much smaller version) but there's 100 rows from 3 different taxonomies.

Not applicable
Author

Disregard that previous sample, use this one instead: