Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
hic
Former Employee
Former Employee

 

If you use equality as a condition when comparing floats, I will flunk you!

I can still hear the words of the Professor in my first programming class when studying for my engineering degree. The threat was very real – he meant it – and the reason was of course the fact that you cannot (always) represent decimal numbers in an exact binary form.

For example, we would never dream of writing a condition

  If( x = 0.3333333 , … )

when we want to test if x equals a third. Never. Because we know that a third cannot be represented exactly as a decimal number. No matter how many threes we add to the number, it will still not be exact.

But it is not uncommon that people make comparisons with an exact decimal number, similar to

      If( x = 0.01 , … )

thinking that it is a valid comparison, although it leads to exactly the same problem as the previous comparison! This becomes obvious if you look at the hexadecimal representation of 0.01:

      0.01 (decimal) = 0.028F5C28F5C28F…. (hex)

The sequence …28F5C… is repeated an infinite number of times, but since QlikView uses a finite number of binary digits (all according to the IEEE standard), QlikView will internally use a “rounded” number.

So what are the consequences? Well, QlikView will sometimes deliver the “wrong” number as result. Examples:

Ceil( 0.15, 0.01 ) will return 0.16
Floor( 0.34, 0.01 ) will return 0.33
0.175*1000 = 175 will return FALSE
Time( Floor( Time#( '04:00:00' ),1/24/60/60 )) will return 03:59:59

What you see are not errors in QlikView. And they are not errors in IEEE 754. Rather, they represent errors in the expectation and usage of binary floating point numbers. Once you understand what binary floating point numbers really are, it makes perfect sense. It's simply that some values cannot be exactly represented as binary numbers, so you get rounding errors. There's no way around it.

Should you want to investigate this yourself, I suggest you start with the following script that generates 100 numbers and their rounded counterparts. In five cases the Ceil() function rounds "incorrectly" and generates a "Diff" different from zero:

Load
   Num(Rounded,'(HEX) 0.000000000000000','.',' ') as RoundedHEX,
   (Round(100*Rounded) - PartsPer100)/100 as Diff,
   *;
Load
   Ceil(PartsPer100/100, 0.01) as Rounded,
   *;
Load
   RecNo() as PartsPer100
   Autogenerate 100 ;

So, what should you do?

First of all, you should realize that the rounding errors are small and usually insignificant. In most cases they will not affect the result of the analysis.

Further, you could avoid rounding with Floor() and Ceil() to sub-integer fractions.

Also, you could convert the numbers to integers, because the errors will only appear if the numbers can have sub-integer components. For instance, if you know that you always deal with dollars and cents, you could convert the numbers to (integer) cents:

  Round( 100*Amount ) as Cents

Or if you know that you never deal with time units smaller than seconds:

  Round( 24*60*60*Time#( Time, 'hh:mm:ss' ) ) as Seconds

And finally, you should never use equality as a condition when comparing floats. Use greater than or less than. My professor isn’t here to flunk you, but rest assured: In his absence, QlikView will do it for him.

HIC

39 Comments
beck_bakytbek
Master
Master

Henric Thanks a lot for your Help

i will implement it

thanks a lot

Beck

0 Likes
548 Views
tschullo
Creator III
Creator III

Somebody needs to explain this to my 5 dollar calculator.... it keeps getting the right answer with no rounding "errors"!

0 Likes
548 Views
hic
Former Employee
Former Employee

If you use the decimal64 format to store your numbers, you will get exact calculations. But if you use the binary64 format, you will get rounding errors like the ones I describe. See Double-precision floating-point format - Wikipedia

So why do we use binary64 and not decimal64?

The main reason is that decimal64 uses more memory and more cpu than binary64. And if you want to handle hundreds of millions of records, you will have a significant performance penalty for decimal64. But for a 5 dollar calculator, the performance penalty is so small so it doesn't matter. Most likely, the calculator uses a decimal based format.

But there is also an other reason: There is an abundance of software libraries for binary64, e.g. statistical ones. We would not be able to use these if we chose decimal64.

HIC

548 Views
tschullo
Creator III
Creator III

Thanks Henric! My calculator feels better now!

0 Likes
548 Views
Not applicable

Interesting post. Explains it very well.

Having said that it doesn't make it right though. You mention statistical software libraries as one reason you are using binary64. I use R a lot and R is fully capable of handling large datasets and return decimal numbers accurately. ! tried to create a histogram for rather small numbers (all between -0.05 and 1.2) with a binwidth of 0.05. Got completely wrong frequencies. Ok it works if I multiply all values with 100 and subsequently divide by 100 again which is fine.

Having said that , this approach doesn't work for using class which returns a string representation of the bins. Any suggestions?

In the image below the left one is created using class (wrong bins) the right one is used creating round function. Also note that the right one is using a step size of 0.05 which results in 0 not being shown properly.

Its a shame

histograms.PNG

548 Views
Anonymous
Not applicable

Thanks for Sharing

Tahemas Momin

0 Likes
548 Views
Oleg_Troyansky
Partner Ambassador/MVP
Partner Ambassador/MVP

Thanks Henric! Very educational!

0 Likes
728 Views
m_perreault
Creator III
Creator III

@hic 

Can you explain why in Qlik (1.00000000000001 + 1) = 2 

It seems strange to me that when I remove the + 1 the value is 1.00000000000001 however when I add 1 its 2 instead of 2.00000000000001

Like wise if I remove one decimal place from my above equation

(1.0000000000001 + 1)  =  2.0000000000001

 

Thanks,
Mark

0 Likes
726 Views
robert99
Specialist III
Specialist III
0 Likes
520 Views