
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Extracting unique values for aggregation - patient data
patient_id | Episode Number | diagnosis | total_cost |
12a | 1x | blah1 | 1000 |
12a | 2y | blah2 | 1000 |
13b | 4z | blah1 | 3000 |
14c | 5x | blah1 | 2500 |
14c | 6z | blah2 | 2500 |
14c | 9x | blah3 | 2500 |
15d | 8y | blah1 | 1000 |
Hi there,
I hope I have posted this in the right place. If not let me know.
I have a large file and I have recreated the format in the example above.
Each patient_id is unique in the real world but each patient may have more than one episode number (also unique) and within each episode they may have more than one diagnosis (not unique). The presence of a few diagnosis/episodes for each patient means that the unique patient_ID and the total_costs for each patient appear more than once in the flat file.
I need to create a measure that extracts the total cost for each unique patient_id so that I can create an overall average total cost per patient as a KPI. I then need to use this measure in a chart with diagnosis as the dimension so i can then see the average total cost per patient associated with each diagnosis (or other fields I have in the real table).
So basically i need to sum the total cost for unique episode numbers and then sum these answers for the unique patient id's
When I just use avg(total_cost) - the results are way to high
Apologies in advance if there is something simple I am missing here but I am struggling with this. As a first step I've tried
sum(aggr(max([Episode Number]), total_cost)/count(aggr(max([Episode Number]),distinct [Episode Number]))) in an effort to start by pick out the unique episode ids using max and use aggr to create a temp table and then sum the total costs over this. I am then trying to divide by the number of unique Episode Numbers by counting them. This expression is not throwing an error but it is giving me a blank chart.
On top of this, I just cannot get to the next level and figure out how to work the unique patient_id's into this expression so the average is actually based on these.
Any help much appreciated
kindest regards,
Louise
- « Previous Replies
-
- 1
- 2
- Next Replies »


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Let me know if this is what you are looking for.
I have tried to get the avg by patient & Episode number with multiple methods.
Let me know if you are looking for different output.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
@I really appreciate the time you have taken to help with this. Unfortunately, I can't open your qlikview .qvw file as I am working in either QlikSense server or Qliksense desktop. I can see only the load script and the data model when i try to open it in qliksense. I can only open .qvf files.
If it is not too much trouble could you post the actual expressions you have compiled and I will try and replicate them?
kind regards,
Louise

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
how about
avg(
aggr(
sum(total_cost),
patient_id)
)
?


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Louise,
Maybe
Average total cost per patient as a KPI
avg(Aggr(sum(total_cost),patient_id)) = 3375
Average total cost per patient associated with each diagnosis
diagnosis | avg(Aggr(sum(total_cost),patient_id,diagnosis)) |
---|---|
blah1 | 1875 |
blah2 | 1750 |
blah3 | 2500 |
Andrew

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks Marcus - I've had a look at that - I need to take the average total costs of the distinct episode numbers for each patient before I then take the average of those figures calculated on each distinct patient id. Which i think is a nested aggregation and not allowed?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Andrew,
Thank you for that, again much appreciated, but I have the same problem there, yes it looks like that is giving me an average of the total costs for each patient but before that calculation, I need to ensure that each patients total cost value is comprised of the total of his episode number costs.
In the real world - a patient might come to the hospital once - he has one episode that costs €1000 - that is the figure used in the sum part of the calculation of the average cost for a patient - simple enough.
However in some cases the patient will present at the hospital 2 or 3 times each time has a unique episode number and is associated with its own cost. These costs have to be summed for each patient to get their real total cost figure and it is this figure that is included in the 'average per patient' calculation. This is the bit I am missing prior to the average for each patient.
I am not too worried about the diagnosis as I can just pick that as a dimension on a chart once I have the expression for doing the above calculation sorted.
Apologies for melting peoples heads.
Louise

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
To clarify - in the above example patient 14 c has 3 episode numbers (5x, 6z and 9x) each with an associated cost of (2500, 2500 and 2500 respectively - terrible example - should have made those numbers different), the costs for each of these has to be summed to get the total cost for patient 14c (which is 2500 +2500+2500 = 7500), that is the total cost used for patient 14 c when the average total cost per patient is being calculated.


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Louise,
Apologies if I'm not following you but do you want the average cost of diagnosis per patient?
=avg(aggr(Sum(total_cost),diagnosis,patient_id))
1928.5714
diagnosis | avg(Aggr(sum(total_cost),patient_id,diagnosis)) |
---|---|
1928.5714 | |
blah1 | 1875 |
blah2 | 1750 |
blah3 | 2500 |
Switching the dimension give the same overall average
patient_id | avg(Aggr(sum(total_cost),patient_id,diagnosis)) |
---|---|
1928.5714 | |
12a | 1000 |
15d | 1000 |
14c | 2500 |
13b | 3000 |
Cheers
Andrew

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If you can provide the expected output for your above data, that will be helpful to give you exactly what you want

- « Previous Replies
-
- 1
- 2
- Next Replies »