Extracting unique values for aggregation - patient... - Qlik Community

betthisisnttake · ‎2017-07-24

patient_id	Episode Number	diagnosis	total_cost
12a	1x	blah1	1000
12a	2y	blah2	1000
13b	4z	blah1	3000
14c	5x	blah1	2500
14c	6z	blah2	2500
14c	9x	blah3	2500
15d	8y	blah1	1000

Hi there,

I hope I have posted this in the right place. If not let me know.

I have a large file and I have recreated the format in the example above.

Each patient_id is unique in the real world but each patient may have more than one episode number (also unique) and within each episode they may have more than one diagnosis (not unique). The presence of a few diagnosis/episodes for each patient means that the unique patient_ID and the total_costs for each patient appear more than once in the flat file.

I need to create a measure that extracts the total cost for each unique patient_id so that I can create an overall average total cost per patient as a KPI. I then need to use this measure in a chart with diagnosis as the dimension so i can then see the average total cost per patient associated with each diagnosis (or other fields I have in the real table).

So basically i need to sum the total cost for unique episode numbers and then sum these answers for the unique patient id's

When I just use avg(total_cost) - the results are way to high

Apologies in advance if there is something simple I am missing here but I am struggling with this. As a first step I've tried

sum(aggr(max([Episode Number]), total_cost)/count(aggr(max([Episode Number]),distinct [Episode Number]))) in an effort to start by pick out the unique episode ids using max and use aggr to create a temp table and then sum the total costs over this. I am then trying to divide by the number of unique Episode Numbers by counting them. This expression is not throwing an error but it is giving me a blank chart.

On top of this, I just cannot get to the next level and figure out how to work the unique patient_id's into this expression so the average is actually based on these.

Any help much appreciated

kindest regards,

Louise

neelamsaroha157 · ‎2017-07-24

Let me know if this is what you are looking for.

I have tried to get the avg by patient & Episode number with multiple methods.

Let me know if you are looking for different output.

betthisisnttake · ‎2017-07-25

Hi,

@I really appreciate the time you have taken to help with this. Unfortunately, I can't open your qlikview .qvw file as I am working in either QlikSense server or Qliksense desktop. I can see only the load script and the data model when i try to open it in qliksense. I can only open .qvf files.

If it is not too much trouble could you post the actual expressions you have compiled and I will try and replicate them?

kind regards,

Louise

marcus_malinow · ‎2017-07-25

how about

avg(

aggr(

sum(total_cost),

patient_id)

)

?

effinty2112 · ‎2017-07-25

Hi Louise,

Maybe

Average total cost per patient as a KPI

avg(Aggr(sum(total_cost),patient_id)) = 3375

Average total cost per patient associated with each diagnosis

diagnosis	avg(Aggr(sum(total_cost),patient_id,diagnosis))
blah1	1875
blah2	1750
blah3	2500

Andrew

betthisisnttake · ‎2017-07-25

Thanks Marcus - I've had a look at that - I need to take the average total costs of the distinct episode numbers for each patient before I then take the average of those figures calculated on each distinct patient id. Which i think is a nested aggregation and not allowed?

betthisisnttake · ‎2017-07-25

Hi Andrew,

Thank you for that, again much appreciated, but I have the same problem there, yes it looks like that is giving me an average of the total costs for each patient but before that calculation, I need to ensure that each patients total cost value is comprised of the total of his episode number costs.

In the real world - a patient might come to the hospital once - he has one episode that costs €1000 - that is the figure used in the sum part of the calculation of the average cost for a patient - simple enough.

However in some cases the patient will present at the hospital 2 or 3 times each time has a unique episode number and is associated with its own cost. These costs have to be summed for each patient to get their real total cost figure and it is this figure that is included in the 'average per patient' calculation. This is the bit I am missing prior to the average for each patient.

I am not too worried about the diagnosis as I can just pick that as a dimension on a chart once I have the expression for doing the above calculation sorted.

Apologies for melting peoples heads.

Louise

betthisisnttake · ‎2017-07-25

To clarify - in the above example patient 14 c has 3 episode numbers (5x, 6z and 9x) each with an associated cost of (2500, 2500 and 2500 respectively - terrible example - should have made those numbers different), the costs for each of these has to be summed to get the total cost for patient 14c (which is 2500 +2500+2500 = 7500), that is the total cost used for patient 14 c when the average total cost per patient is being calculated.

effinty2112 · ‎2017-07-25

Hi Louise,

Apologies if I'm not following you but do you want the average cost of diagnosis per patient?

=avg(aggr(Sum(total_cost),diagnosis,patient_id))

1928.5714

diagnosis	avg(Aggr(sum(total_cost),patient_id,diagnosis))
	1928.5714
blah1	1875
blah2	1750
blah3	2500

Switching the dimension give the same overall average

patient_id	avg(Aggr(sum(total_cost),patient_id,diagnosis))
	1928.5714
12a	1000
15d	1000
14c	2500
13b	3000

Cheers

Andrew

Kushal_Chawda · ‎2017-07-25

If you can provide the expected output for your above data, that will be helpful to give you exactly what you want

Extracting unique values for aggregation - patient data