Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm doing a straight extra (LOAD * SQL SELECT * FROM TABLE) from Oracle via OLEDB and when I add the number up in QlikVIew I get a different answer than in Oracle using a straight select sum from table. Over about 185K records I'm off by 9 cents. Any ideas on how to correct this.
I gather than in Oracle, the field is a decimal number like 123.45 for one hundred twenty three dollars and forty five cents?
Try this. As you read in the number, multiply it by 100 in the SQL to turn it to cents. Then wherever you're displaying it in QlikView, divide by 100 again to turn it back into dollars.
If that resolves the problem, I'll explain what's gone wrong.
I gather than in Oracle, the field is a decimal number like 123.45 for one hundred twenty three dollars and forty five cents?
Try this. As you read in the number, multiply it by 100 in the SQL to turn it to cents. Then wherever you're displaying it in QlikView, divide by 100 again to turn it back into dollars.
If that resolves the problem, I'll explain what's gone wrong.
That fixed the problem. Thanks. I look forward to hearing why this happening.
Although they rarely surface and the magnitude is typically small, QlikView is subject to certain kinds of mathematical errors. Here are some examples:
Expression QlikView Correct
ceil ( 0.28 , 0.01 ) 0.29 0.28
floor ( 29/100 , 0.01 ) 0.28 0.29
round ( -3.5/100 , 0.01 ) -0.04 -0.03
round ( 14.5/100 , 0.01 ) 0.14 0.15
round ( 50.55 , 0.1 ) 50.5 50.6
if ( 0.175*1000 = 175 , 'equal' , 'not equal' ) not equal equal
if ( 0.29 = 29/100 , 'equal' , 'not equal' ) not equal equal
In the case of rounding, one explanation considered was that rounding is not a mathematical absolute, but rather a convention that varies from country to country, and industry to industry. However, QlikView's documentation specifies the behavior explicitly:
"round(x[,step[,offset]])
Rounding of x upwards or downwards n number of steps with an offset of offset. The result is a number. If x is exactly in the middle of an interval, it is rounded upwards."
This is known as asymmetric arithmetic rounding. And asymmetric arithmetic rounding should produce the results in my "Correct", column, not the actual results.
What's actually happening is that QlikView is storing numbers internally in binary. That works just fine for the integer portion of numbers, but can cause problems for the fractional part. Many fractions that can be easily expressed in base 10 cannot be easily expressed in base 2.
For example, let's take the simple fraction 29/100. That is easily expressed in base 10 as 0.29. But in base 2, it is this:
0.0100101000111101011100
And then all but the first two digits after the decimal repeat. So QlikView must either truncate or round off this value at some number of digits. I would guess it is rounding off, which in this case is a simple operation of if the digit past yours is 1, round up, else truncate.
Depending on the number of digits QlikView stores, we get a number that is either less than or greater than 0.29, but never exactly equal. Therefore, the floor(), ceiling() and round() functions won't always work as expected on fractional numbers.
Mind you, there's nothing strictly wrong with binary representations of numbers. Decimal representations have the same exact problems, but we're all used to them. Take a very simple expression like 1/3. We all know that this is 0.33333333 and on forever. We know that computers have a limited number of digits, so might only store eight 3s, for example. If we then multiply by 3, we won't get 1. We'll get 0.99999999. So if I write the expression if((1/3)*3=1,'equal','not equal'), I probably wouldn't be surprised to get not equal, and wouldn't consider it a bug. So it's just as big of a problem as the problems with binary numbers, except that we're used to it.
This sort of error should only affect fractions. Integers will be stored as exact values. So that gives us a workaround to the problem - make sure all of your data is in an integer format by the time it gets to QlikView. For instance, let's say you're reading data out of an SQL database from an accounting system, with amounts like $57.23 US dollars:
LOAD
ID
,AMOUNT as "Amount in Dollars"
;
SQL SELECT
ID
,AMOUNT
FROM MY_ACCOUNTING_TABLE
;
We "know" that QlikView may not be able to store these amounts correctly due to its internal numeric format. So multiply them by 100 before QlikView ever sees them:
LOAD
ID
,AMOUNT as "Amount in Cents"
;
SQL SELECT
ID
,AMOUNT * 100 as AMOUNT
FROM MY_ACCOUNTING_TABLE
;
Now QlikView can store the EXACT number. To display it in dollars, simply divide by 100 at display time. It will display correctly, and the fact that QlikView doesn't have the exact internal format for it won't matter. Also be careful about how you calculate. If you divide by 100, then multiply by 100, you may not get the right result. If you first multiply by 100, then divide by 100, I think you would get the right result.
I've run into all of this before. I designed a compiler and specified that the language should use floating point binary as its internal format. I should have realized that it would lead to these sorts of rounding problems, but I did not. Fortunately, it was fairly easy to convert to a decimal format once we noticed the problems.
I doubt that QlikView would be so easy to fix. Using binary may be the only thing keeping applications small and fast enough to run on the type of hardware typically used for QlikView. In that case, we'd be facing a trade off of calculating numbers correctly, but forcing everyone to move to faster 64-bit machines with lots of RAM, or just leaving the calculations incorrect. I'm guessing that QlikView will stick with incorrect calculations for the foreseeable future.
Personally, we run on faster 64-bit machines with lots of RAM, so I'd prefer correct calculations. But I won't be so selfish as to insist that QlikTech should cater to my needs instead of the majority who probably need high speed and small size more than they need perfectly accurate mathematics 100% of the time.
QlikTech themselves have given no indication to me of their intended future plans, nor did I ask them about them. All I did was report this as a bug with even more documentation than I've included here, and they confirmed that they are indeed using binary floating point as their internal format. They specified the exact format as well, and while I can't seem to locate the email, I believe it was IEEE double-precision floating point.
Wow! I read most of that and can safely say I understand some of it.
I don't know that I agree that speed and small size have anything on accuracy though. I'm willing to bet that if I asked users to choose between speed, accuracy and small size, they'd choose accuracy every time.
Your explanation makes me wonder if I haven't run into a similar problem in the past. I recall a few frustrating episodes where I could not figure out why a number wasn't equal to the same number.
John,
Good post. I guess you've been through some troubles with all this numbers. I had a few issues sometimes, and simply used x100 and /100 to solve the problem without going into the theory why it worked this way.
Would you mind to post this on Wiki?
Of course users would prefer a more consistently-accurate approach (extended-precision decimal floating point, for example). They don't have to take into account the cost of hardware. Looking at the bigger picture, though, is it worth spending $10,000 on a bigger server to avoid a $0.09 cent error on a report? An error that can be prevented with a little extra programming effort if desired?
In our shop, the answer is probably yes, since we already have the bigger server. That may well be true where you work as well.
But QlikTech has a company to run, and is in the business of making sales. An extended-precision decimal floating point might well cost enough additional memory and processing power as to REQUIRE a 64-bit server for most applications. For the sake of argument, going in that direction from the very beginning might have meant they'd have 50% fewer customers right now. Compare that to the number of customers who have noticed any accuracy problems, and the number of those customers for whom those problems are serious, and worth getting rid of QlikView as a result.
So I understand the decision. I don't like it, but I understand it.
Basically, don't use QlikView as a robust accounting system. For most other purposes, it's probably usually good enough.
Michael Solomovich wrote:
John,
Good post. I guess you've been through some troubles with all this numbers. I had a few issues sometimes, and simply used x100 and /100 to solve the problem without going into the theory why it worked this way.
Would you mind to post this on Wiki?<div></div>
Thanks. This hasn't actually caused me any trouble - none of my applications currently require this sort of accounting rigor. I just noticed a number of examples of this on the forums and decided to track it down because I don't like it when my software doesn't work correctly. I also suspected that I knew the reason why it wasn't working based on my earlier compiler experience. When I reported the bug, it wasn't something I expected would be fixed, but I wanted to make sure I understood the problem so that I could look out for it and work around it.
Added to the Wiki.