Floating Point Demystified, Part 2: Why Doesn't 0.1 + 0.2 == 0.3?
This is the second article in a series. My previous entry Floating Point Demystified, Part 1 was pretty dense with background information. For part 2 let’s answer a burning, practical question that bites almost every new programmer at some point:
Why oh why doesn’t 0.1 + 0.2 == 0.3?
The answer is: it does! In mathematics. But floating point has
already failed us before we even get to the addition part.
Double-precision floating point is totally incapable of representing
0.1, 0.2, or 0.3. When you think you’re adding those numbers in
double precision, here is what you are actually adding:
0.1000000000000000055511151231257827021181583404541015625
+ 0.200000000000000011102230246251565404236316680908203125
-----------------------------------------------------------
0.3000000000000000444089209850062616169452667236328125

(The result shown here is the sum after it has been rounded to the
nearest representable double, which is why the digits above don’t
quite add up on paper.)
But if you just type in 0.3 directly, what you’re getting is:
0.299999999999999988897769753748434595763683319091796875
Since those last two numbers aren’t the same, the equality comparison returns false.
Some of you reading this probably won’t believe me. “You’ve just
printed a bunch of decimal places, but what you have is still an
approximation, just like 0.1 and 0.2 are!” I can’t blame you for
your distrust. Computer systems have traditionally made it
extraordinarily difficult to see the precise value of a
floating-point number. Anything you’ve seen printed out before
probably was an approximation. You may have lost faith that
floating-point values even have an exact value that can be
printed. You might think that their true value is an
infinitely repeating decimal like \(0.\bar{1}\). Or maybe
it’s an irrational number like \(\sqrt 2\) whose decimal
expansion never repeats or terminates.
The truth is that floating-point numbers are rational and can always
have their exact value printed out in a finite decimal. The numbers
above are absolutely precise renderings of the double’s true value!
You can try it yourself by using the built-in decimal module in
Python, which supports arbitrary-precision decimal numbers:
$ python
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 1000 # Optional here: Decimal(float) is exact; this only affects later arithmetic
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
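While we’re in the same session, the same trick recovers exactly the
long values quoted above and confirms the failed comparison:

>>> Decimal(0.1 + 0.2)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(0.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> 0.1 + 0.2 == 0.3
False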
Of course for single precision, the precise number would be
different. 0.1 in single precision is a little shorter:
0.100000001490116119384765625
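Python doesn’t have a built-in single-precision type, but if you want
to check that value yourself, one small sketch is to round-trip 0.1
through the standard struct module’s IEEE single-precision format and
print the result exactly:

$ python
>>> import struct
>>> from decimal import Decimal
>>> # Pack 0.1 as an IEEE single (rounding it), then unpack it back into a Python float.
>>> single = struct.unpack('f', struct.pack('f', 0.1))[0]
>>> Decimal(single)
Decimal('0.100000001490116119384765625')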
Then Why Does The Computer Print 0.1?
The reason everyone gets so confused to begin with is that
basically every programming language will natively print 0.1
instead of the double’s true value:
$ python
Python 2.7.10 (default, Oct 23 2015, 18:05:06)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
>>> 0.1
0.1
$ node
> 0.1
0.1
$ irb
irb(main):001:0> 0.1
=> 0.1
$ lua
Lua 5.3.2 Copyright (C) 1994-2015 Lua.org, PUC-Rio
> print(0.1)
0.1
Why did four languages in a row “lie” to me about the value of 0.1?
Things get a little interesting here. While all four languages printed the same approximation, they arrived at it in two totally different ways.
You can see the difference between them if you try to print a slightly different value:
$ python
Python 2.7.10 (default, Oct 23 2015, 18:05:06)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
>>> 0.1 + 0.2
0.30000000000000004
$ node
> 0.1 + 0.2
0.30000000000000004
$ irb
irb(main):001:0> 0.1 + 0.2
=> 0.30000000000000004
$ lua
Lua 5.3.2 Copyright (C) 1994-2015 Lua.org, PUC-Rio
> print(0.1 + 0.2)
0.3
Lua is the odd one out here: everyone else got a really long number
with a “4” at the end, but Lua just printed 0.3. What’s going on here?
If you look in the Lua source, you’ll see that it is using printf()
with a %.14g
format string (this technically varies based on the
platform, but it’s probably true on your platform). With this format
string, printf()
is specified to print the double
according to
this algorithm:
- The value is rounded to N significant decimal digits (14 in this case). For us this yields 0.30000000000000.
- Trailing zeros are removed, yielding 0.3.
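You don’t need Lua to watch this happen. Python’s % operator supports
the same printf-style %g conversion, so here is a small sketch of the
14-digit rounding next to the full 17 digits:

$ python
>>> '%.14g' % (0.1 + 0.2)   # rounded to 14 significant digits, trailing zeros dropped
'0.3'
>>> '%.17g' % (0.1 + 0.2)   # 17 significant digits keeps the tell-tale 4
'0.30000000000000004'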
This explains why Lua got the answer it did. But what about the other implementations? They printed a number that was much longer, yet it still wasn’t the number’s true value – that would be even longer:
0.3000000000000000444089209850062616169452667236328125
Why did the other languages all decide to stop printing at that first “4”? The fact that they all print the same thing should hint to us that there is something significant about that answer.
The other three languages all follow this rule: print the shortest
string that will unambiguously convert back to the same number. In
other words, the shortest string such that float(str(n)) == n.
(This guarantee of course doesn’t apply to Infinity and NaN.)
So while the values printed by the other three languages are not
exact, they are unique. No two values will map to the same string.
And each string will map back to the correct float. These are useful
properties, even if they do cause confusion sometimes by hiding the
fact that float(0.1) is not exactly 0.1.
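If you want to convince yourself of that round-trip property, here is
a small sanity check in Python, whose float repr has produced the
shortest round-tripping string since Python 3.1 (backported to 2.7):

$ python
>>> import random
>>> xs = [random.uniform(-1e9, 1e9) for _ in range(100000)]
>>> all(float(repr(x)) == x for x in xs)   # every repr converts back to the same double
True
>>> repr(0.1)                              # short, even though the true value is much longer
'0.1'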
We can ask one more question about Lua. If you analyze the precision
available in a double (which has a 52-bit stored mantissa, or 53
significant bits counting the implicit leading 1), you can work out
that 17 decimal digits is enough to uniquely identify every possible
value. In other words, if Lua used the format specifier %.17g
instead of %.14g, it would also have the property that
tonumber(tostring(x)) == x. Why not do that, so that Lua’s
number-to-string formatting can also precisely represent the
underlying value?
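That claim about 17 digits is easy to spot-check. Here is a rough
sketch in Python that reinterprets random 64-bit patterns as doubles
and confirms that a %.17g rendering always converts back to the
identical value:

$ python
>>> import math, random, struct
>>> def random_double():
...     # Reinterpret a random 64-bit pattern as a double, skipping inf and NaN.
...     while True:
...         x = struct.unpack('d', struct.pack('Q', random.getrandbits(64)))[0]
...         if not (math.isnan(x) or math.isinf(x)):
...             return x
...
>>> xs = [random_double() for _ in range(100000)]
>>> all(float('%.17g' % x) == x for x in xs)    # 17 significant digits always round-trips
True
>>> float('%.14g' % (0.1 + 0.2)) == 0.1 + 0.2   # 14 digits sometimes does not
False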
I can’t find any reference where the Lua authors directly explain their motivation for this choice (someone did ask once, but the author’s response didn’t give a specific rationale). I can speculate, though.
If we try that out, the downside quickly becomes clear. Let’s use
printf
from Ruby to demonstrate:
$ irb
irb(main):001:0> printf("%.17g\n", 0.1)
0.10000000000000001
Ah, we’ve lost the property that 0.1 prints as 0.1. That trailing 1
isn’t junk – as we saw at the beginning of the article, the precise
value of this number does include about 40 more digits of real,
non-zero data. But the extra digits aren’t necessary for uniqueness,
since float(0.1) and float(0.10000000000000001) map to exactly the
same value. I am guessing that the Lua authors decided that making
these common cases print short strings was more important than
capturing full precision. In Lua you can always use
string.format('%.17g', num) if you really want to.
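That uniqueness claim – that 0.1 and 0.10000000000000001 parse to the
very same double – is easy to check; in Python, for example:

$ python
>>> float('0.1') == float('0.10000000000000001')
True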
The printf()
function doesn’t offer the functionality of “shortest
unambiguous string.” The best it can do is omit trailing zeros.
There is no printf()
format specifier that will do what Ruby,
Python, and JavaScript are doing above. And since Lua is trying to
stay small, it wouldn’t make sense to include this somewhat
complicated functionality.
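As a tiny illustration of that limitation, %g can only trim trailing
zeros from a fixed precision: values that happen to be exactly
representable come out clean, while everything else drags in every
digit up to the chosen precision (shown here with Python’s
printf-style formatting):

$ python
>>> '%.17g' % 0.5    # exactly representable, so the trailing zeros all trim away
'0.5'
>>> '%.17g' % 0.1    # inexact, so all 17 significant digits show up
'0.10000000000000001'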
How to calculate this “shortest unambiguous string” efficiently is trickier than you might expect. In fact, the best known algorithms for calculating it were published in 2010 (Printing Floating-Point Numbers Quickly and Accurately with Integers) and in 2016 (Printing Floating-Point Numbers: A Faster, Always Correct Method). There is a surprising amount of work that goes into even the most basic and low-level problems in Computer Science!