Skip to main content
Announcements
Qlik Introduces a New Era of Visualization! READ ALL ABOUT IT
cancel
Showing results for 
Search instead for 
Did you mean: 
MrBosch
Creator
Creator

How can I obtain a value by interpolation from a large dataset

Hi all,

I am trying to think of a way to interpolate and calculate a value.
This is the scenario: a person in a certain age, has an expected minimum and maximum hearing loss per frequency.

I know for 7 frequencies (Hz) what the expected min/max loss is (in dB) per biogender (m/f) and per age.

In my dashboard I do already have values for the variables 'var_Gender', 'var_Age' and (for each ear (left/right), score per frequency) 'var_Left_500/var_Left_1000' and so on.

So now I 'only' need Qlikview to 'learn' the dataset (some smart LOAD INLINE probably).

This is how it looks in Excel for only the category 20 and 30 years of age (but there are many more (up to 70 years)):

MrBosch_0-1693333658171.png

How would I be able to store all these numbers per frequency, age and gender, so when I select one person, it can calculate the interpolated (if required... when the age is in between the 20 and 30 in this example) Best and Worst value?

So if our subject is a 25 year old female and we look at 3000Hz, the expected outcome would be 0,2 as best and 5,7 as worst loss.

Anyone an idea?

I would like to thank you in advance already for even reading this and thinking about it 🙂

Alex

Labels (1)
2 Solutions

Accepted Solutions
justISO
Specialist
Specialist

Hi, I'll provide my toughs how I would try to solve this. First, I would merge all possible values into one table, so you have
Gender |Age |Hz |Best |Worst
Male |20 |500 |0 |5,5
Male |20 |1000 |0 |6
...
Female |70 |6000 |? |?
Female |70 |8000 |? |?
Then, as you have your subjects gender/age/frequency you can 'look' (left join in load script, or some aggregation in report level) between lower and higher ranges you know.
In your case, your subject female age is 25. Floor(subject_age/10)*10 will give you age 20 (lower known range) and ceil(subject_age/10)*10 will give you age 30 (higher known range). From your merged main table you take possible values for female/3000hz for ages corresponding lower/higher (floor/ceil) age ranges. So you have something like this:
Gender |subject_age |Hz |Age_low |Best_low |Worst_low |Age_high |Best_high |Worst_high
Female |25 |3000 |20 |0 |5,2 |30 |0,4 |6,2
And now you just calculate Best/Worst values according subject_age, something like:
(Best_high-Best_low)/(10*(subject_age-Age_low))+Best_low
(Worst_high-Worst_low)/(10*(subject_age-Age_low))+Worst_low

View solution in original post

marcus_sommer

I think I would simply create a record for each year and KEY and using a mapping-approach to interpolate the values.

The mapping would be quite simple, like:

m: mapping load Age & Gender & Frequency, Best & '|' & Worst from Source;

At next would the age offset between the ages added respectively included - if they are always 10 years this step isn't necessary - and this might be done with another mapping or maybe a pick(match()) and then the final population-load with interpolation of the values, with something like this:

load *, Value + ((Best - Value) / applymap('mapOffset', Age, 0) * IterNo) as newValue;
load *, Age + IterNo - 1 as newAge, subfield(BestWorst, '|', 1) as Best, ...;
load *, applymap('m',  Age & Gender & Frequency, null()) as BestWorst, iterno() as IterNo
from Source while iterno() - 1 <= applymap('mapOffset', Age, 0);

You may need a bit adjustments, for example to set the iterno() right - as iterator and to add the age and to calculate the interpolation value - but in general this approach should work.

View solution in original post

3 Replies
justISO
Specialist
Specialist

Hi, I'll provide my toughs how I would try to solve this. First, I would merge all possible values into one table, so you have
Gender |Age |Hz |Best |Worst
Male |20 |500 |0 |5,5
Male |20 |1000 |0 |6
...
Female |70 |6000 |? |?
Female |70 |8000 |? |?
Then, as you have your subjects gender/age/frequency you can 'look' (left join in load script, or some aggregation in report level) between lower and higher ranges you know.
In your case, your subject female age is 25. Floor(subject_age/10)*10 will give you age 20 (lower known range) and ceil(subject_age/10)*10 will give you age 30 (higher known range). From your merged main table you take possible values for female/3000hz for ages corresponding lower/higher (floor/ceil) age ranges. So you have something like this:
Gender |subject_age |Hz |Age_low |Best_low |Worst_low |Age_high |Best_high |Worst_high
Female |25 |3000 |20 |0 |5,2 |30 |0,4 |6,2
And now you just calculate Best/Worst values according subject_age, something like:
(Best_high-Best_low)/(10*(subject_age-Age_low))+Best_low
(Worst_high-Worst_low)/(10*(subject_age-Age_low))+Worst_low

marcus_sommer

I think I would simply create a record for each year and KEY and using a mapping-approach to interpolate the values.

The mapping would be quite simple, like:

m: mapping load Age & Gender & Frequency, Best & '|' & Worst from Source;

At next would the age offset between the ages added respectively included - if they are always 10 years this step isn't necessary - and this might be done with another mapping or maybe a pick(match()) and then the final population-load with interpolation of the values, with something like this:

load *, Value + ((Best - Value) / applymap('mapOffset', Age, 0) * IterNo) as newValue;
load *, Age + IterNo - 1 as newAge, subfield(BestWorst, '|', 1) as Best, ...;
load *, applymap('m',  Age & Gender & Frequency, null()) as BestWorst, iterno() as IterNo
from Source while iterno() - 1 <= applymap('mapOffset', Age, 0);

You may need a bit adjustments, for example to set the iterno() right - as iterator and to add the age and to calculate the interpolation value - but in general this approach should work.

MrBosch
Creator
Creator
Author

Thank you both for your answers. I believe (I am not the expert) they can both work!

Great thing is that I completely understand JustISO's approach and even came up with another possibility:
I will create a 'flat' table in Excel, doing al the interpolation just there and create a min|max for each freq and gender.
I can import this or... (perhaps) ask ChatGPT to define a function based upon all these X,Y values to create the 'best effort' F(x){gender|age}: that then can calculate the Min or Max for each situation!

Again thank you both. I will have a great weekend reaching the goal by spending some quality time with Qlikview! 🙂