Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
morenoju
Partner - Specialist
Partner - Specialist

Issue with nulls when using sum functions to calculate an average

Hi all,

I'm using this function to calculate a speed average:

sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length*speed_mph)
/
sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length)
/
count(Date_A)

My roads are divided in segments ("links") of different lengths. I decided to use the sum function because I want to calculate a weighted average. Example: if for 8 miles the speed was 50 and for 2 miles it was 70, the average should not be 60, but 54.

Users can pick more than one date to calculate these average, therefore the / count (Date_A) at the end.

If I pick just one date and there are missing values (nulls) for some of the hours, it is not a problem, I simply get a gap in the chart:

 image.png

 

However, if I pick two or more dates, the chart makes an average considering the null values as zero.

That's why picking 10/11 and 10/12, at 4:30 pm where the value should be 24 mph (the only value available is the 10/12 4:30 pm record which is 24) the graph is showing 24 + 0 / 2 = 12.

image.png

Do you have any idea on what to do so Qlik interprets the null values correctly?

I tried to use:

sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length*speed_mph)
/
sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length)
/
count({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length*speed_mph)

That works when I have only one "link" (with one length) selected, but the numbers don't match for selection of multiple links.

Am I using a wrong approach to the problem?

Thanks much!

Labels (4)
1 Solution

Accepted Solutions
morenoju
Partner - Specialist
Partner - Specialist
Author

Hi Sunny, I have link_date in the x-axis and then Date_A, Date_B and Date_C to represent three different group of days at the same time for comparisons.

I think the key is using an aggregation. I'm now calculating first the weighted averages using the sum (length*speed) / sum (length) and then doing an additional average on link_date and link_time_15min:

avg(aggr(sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length*speed_mph)/sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length),link_date,link_time_15min))

I think this gives the correct result even for those dates with nulls. I'll test further and post an update later.

Thanks!

View solution in original post

6 Replies
sunny_talwar

How about this

sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length*speed_mph)
/
sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length)
/
count({<length *= {"*"}, speed_mph *= {"*"}>}Date_A)

morenoju
Partner - Specialist
Partner - Specialist
Author

Thanks Sunny, but still same result as dividing by count (Date_A).

I'm thinking of how I calculate this things manually: I first work with individual links. I calculate the average of the speed_mph for 4:00, 4:15, 4:30, etc. for the selected days.

Once I have the result, I do the weighted average:

((avg_speed1*length1)+(avg_speed2*length2))/(length1 + length2)

I'm looking at how to do that in Qlik, maybe with aggr, but so far no luck.

sunny_talwar

Why are you counting Date_A here and not link_date here?

morenoju
Partner - Specialist
Partner - Specialist
Author

Hi Sunny, I have link_date in the x-axis and then Date_A, Date_B and Date_C to represent three different group of days at the same time for comparisons.

I think the key is using an aggregation. I'm now calculating first the weighted averages using the sum (length*speed) / sum (length) and then doing an additional average on link_date and link_time_15min:

avg(aggr(sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length*speed_mph)/sum({<start_timestamp=,link_date = p(Date_A),link_hour=p(hour_A)>}length),link_date,link_time_15min))

I think this gives the correct result even for those dates with nulls. I'll test further and post an update later.

Thanks!

sunny_talwar

Sounds good

morenoju
Partner - Specialist
Partner - Specialist
Author

I tested it and the new expression handles the nulls correctly!

Regards