“If you use equality as a condition when comparing floats, I will flunk you!”
I can still hear the words of the professor in my first programming class when I was studying for my engineering degree. The threat was very real – he meant it – and the reason was of course the fact that you cannot (always) represent decimal numbers in an exact binary form.
For example, we would never dream of writing a condition
If( x = 0.3333333 , … )
when we want to test if x equals a third. Never. Because we know that a third cannot be represented exactly as a decimal number. No matter how many threes we add to the number, it will still not be exact.
But it is not uncommon that people make comparisons with an exact decimal number, similar to
If( x = 0.01 , … )
thinking that it is a valid comparison, although it leads to exactly the same problem as the previous comparison! This becomes obvious if you look at the hexadecimal representation of 0.01:
0.01 (decimal) = 0.028F5C28F5C28F… (hex)
The sequence …28F5C… is repeated an infinite number of times, but since QlikView uses a finite number of binary digits (all according to the IEEE standard), QlikView will internally use a “rounded” number.
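The same truncation can be observed outside QlikView in any language that stores numbers as IEEE 754 doubles. The following is a minimal sketch in Python, used here purely for illustration (it is not Qlik script, and QlikView's internal handling is not shown). It relies only on the standard `float.hex()` method and the `decimal` module; the variable names are made up for this example.

```python
from decimal import Decimal

x = 0.01

# Hexadecimal form of the stored double. The repeating pattern is cut off
# after 13 hex digits of mantissa, and the last digit is rounded.
stored_hex = x.hex()

# Exact decimal value of the stored double, without any display rounding.
stored_exact = Decimal(x)

# Signed difference between the stored value and the intended 0.01.
error = stored_exact - Decimal("0.01")

print(stored_hex)
print(stored_exact)
print(error)
```

The hex string is the normalized form of the digits shown above (shifted by one bit), and the error is tiny but not zero: the stored number is slightly larger than 0.01.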
So what are the consequences? Well, QlikView will sometimes deliver the “wrong” number as the result. Examples:
Ceil( 0.15, 0.01 ) will return 0.16
Floor( 0.34, 0.01 ) will return 0.33
0.175*1000 = 175 will return FALSE
Time( Floor( Time#( '04:00:00' ),1/24/60/60 )) will return 03:59:59
What you see are not errors in QlikView. And they are not errors in IEEE 754. Rather, they represent errors in the expectation and usage of binary floating point numbers. Once you understand what binary floating point numbers really are, it makes perfect sense. It's simply that some values cannot be exactly represented as binary numbers, so you get rounding errors. There's no way around it.
Should you want to investigate this yourself, I suggest you start with the following script that generates 100 numbers and their rounded counterparts. In five cases the Ceil() function rounds "incorrectly" and generates a "Diff" different from zero:
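// Top level: show the rounded value in hexadecimal and compute the deviation.
Load
   Num(Rounded,'(HEX) 0.000000000000000','.',' ') as RoundedHEX,
   (Round(100*Rounded) - PartsPer100)/100 as Diff,
   *;
// Middle level: round each hundredth up to the nearest 0.01.
Load
   Ceil(PartsPer100/100, 0.01) as Rounded,
   *;
// Bottom level: generate the integers 1 to 100.
Load
   RecNo() as PartsPer100
   Autogenerate 100 ;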
So, what should you do?
First of all, you should realize that the rounding errors are small and usually insignificant. In most cases they will not affect the result of the analysis.
Further, you could avoid rounding with Floor() and Ceil() to sub-integer fractions.
Also, you could convert the numbers to integers, because the errors will only appear if the numbers can have sub-integer components. For instance, if you know that you always deal with dollars and cents, you could convert the numbers to (integer) cents:
Round( 100*Amount ) as Cents
Or if you know that you never deal with time units smaller than seconds:
Round( 24*60*60*Time#( Time, 'hh:mm:ss' ) ) as Seconds
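The reason the integer conversion helps can be sketched in Python, again only as an illustration of IEEE 754 doubles and not as Qlik script. The helper names `to_cents` and `to_seconds` are hypothetical, and the sketch assumes that a time is stored as a fraction of a day, as in the expressions above.

```python
def to_cents(amount: float) -> int:
    """Convert a dollar amount to whole cents, rounding to the nearest cent."""
    return round(100 * amount)


def to_seconds(day_fraction: float) -> int:
    """Convert a fraction of a day to whole seconds."""
    return round(24 * 60 * 60 * day_fraction)


prices = [0.10] * 10

# Summing the floats accumulates the small representation errors.
float_total = sum(prices)

# Summing integer cents is exact.
cent_total = sum(to_cents(p) for p in prices)

print(float_total == 1.0)    # the float sum misses 1.0 by a tiny amount
print(cent_total == 100)     # the integer sum is exactly 100 cents
print(to_seconds(4 / 24))    # 04:00:00 expressed as whole seconds
```

Note that the conversion rounds rather than truncates: a product such as 100 × 0.29 can land just below 29, and truncating it would lose a cent.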
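And finally, you should never use equality as a condition when comparing floats. Use greater than or less than instead, for example by testing whether the difference between the two numbers is smaller than a small tolerance. My professor isn’t here to flunk you, but rest assured: In his absence, QlikView will do it for him.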
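One common way to follow this advice is to replace the equality test with a "less than" test on the absolute difference. Here is a minimal sketch in Python, for illustration only. The function name `nearly_equal` and the tolerance of 1e-9 are assumptions made for this example; a suitable tolerance depends on the size of the numbers in your data.

```python
def nearly_equal(a: float, b: float, tolerance: float = 1e-9) -> bool:
    """Return True if a and b differ by less than the given tolerance."""
    return abs(a - b) < tolerance


x = 0.1 + 0.2

print(x == 0.3)              # exact equality fails because of rounding
print(nearly_equal(x, 0.3))  # the tolerance test succeeds
```

The same pattern carries over to a QlikView condition by comparing the absolute difference against a small constant instead of using the equals sign.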