dumpfp: A Tool to Inspect Floating-Point Numbers
Floating-point math has a reputation for being unpredictable and hard to understand. I think a major reason is that floating-point values are hard to inspect. Take the following program:
Here is the output I get on my system:
You might be tempted to think that the variable x is indeed equal to the value 0.1, but don’t be fooled! In fact, 0.1 is a rounded version of x’s true value, which will become apparent if we ask for more precision:
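The effect is easy to reproduce in most languages. Here is a minimal sketch, assuming Python (where a float is an IEEE 754 double on common platforms); the variable names are just for illustration:

```python
x = 0.1

# Six digits after the decimal point: the value looks like exactly 0.1.
rounded = format(x, ".6f")

# Twenty digits: the rounding error in the stored value becomes visible.
precise = format(x, ".20f")

print(rounded)
print(precise)
```

The two strings describe the same stored number; only the amount of rounding applied when printing differs.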
It’s hard to understand what we can’t easily inspect. To remedy this, I’ve just written a new tool called dumpfp. Think of it as the floating-point toString() method you never had. It prints the precise value of the number, in both rational and decimal forms:
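The same two views can be approximated without the tool. As a minimal sketch, assuming Python and its standard fractions and decimal modules (these are not part of dumpfp, and this is not how dumpfp is implemented):

```python
from decimal import Decimal
from fractions import Fraction

x = 0.1

# Both constructors convert the stored double exactly, with no rounding.
exact_fraction = Fraction(x)
exact_decimal = Decimal(x)

print(exact_fraction)  # numerator/denominator of the stored value
print(exact_decimal)   # every decimal digit of the stored value
```

Because both conversions are exact, the fraction and the decimal expansion describe the very same number, just in two notations.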
Notice that the actual value was not exactly 0.1, because binary floating point can only exactly represent rational values with a power-of-two denominator. You can see that the approximation is closer for double than it is for float.
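To put numbers on that comparison, here is a rough Python sketch. It assumes the struct module's "f" format as a stand-in for a C float; the helper names to_float32 and is_power_of_two are made up for this example:

```python
import struct
from fractions import Fraction


def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def is_power_of_two(n: int) -> bool:
    return n > 0 and n & (n - 1) == 0


tenth = Fraction(1, 10)
as_double = Fraction(0.1)
as_single = Fraction(to_float32(0.1))

# Exact distance of each stored value from the true 1/10.
double_error = abs(as_double - tenth)
single_error = abs(as_single - tenth)

print(as_single, float(single_error))
print(as_double, float(double_error))
```

Both stored values have a power-of-two denominator, so neither can equal 1/10; the double simply has more significand bits to get closer.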
The tool first breaks down the raw bytes of the value into its constituent parts. IEEE floating-point values consist of a sign bit, some number of exponent bits, and a significand. These values are then combined according to the expression significand * 2^exponent, with the sign bit supplying the sign. The tool will show you all of the intermediate values and the final result.
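The breakdown described above can be sketched in a few lines of Python for the double format (1 sign bit, 11 exponent bits, 52 stored significand bits). The function name decompose is hypothetical, and this is an illustration of the idea rather than dumpfp's actual code:

```python
import struct
from fractions import Fraction


def decompose(value: float):
    """Split a finite double into (sign, significand, exponent).

    The returned integers satisfy
    value == (-1)**sign * significand * 2**exponent.
    """
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if biased_exponent == 0x7FF:
        raise ValueError("NaN and Infinity have no finite value")
    if biased_exponent == 0:
        # Zero and subnormals: no implicit leading 1 bit.
        return sign, fraction, -1074
    # Normal numbers: restore the implicit leading 1 bit.
    return sign, (1 << 52) | fraction, biased_exponent - 1075


sign, significand, exponent = decompose(0.1)
value = (-1) ** sign * significand * Fraction(2) ** exponent
print(sign, significand, exponent, value)
```

Here the significand is treated as an integer, so the exponent absorbs both the bias (1023) and the 52 fraction bits. Other presentations keep the significand as a number between 1 and 2 and use a correspondingly larger exponent; the product is the same.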
You’ll notice that these numbers have many decimal digits. Not all of these are significant. We can calculate how many digits are significant, but I couldn’t think of an easy-to-read way to print the exact value while also indicating which digits are significant. If anyone has a bright idea here, please do drop me a line (or fork me on GitHub).
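For a rough sense of scale, assuming an IEEE 754 double: any decimal with up to 15 significant digits survives a round trip through a double, and 17 significant digits are always enough to identify a double uniquely. The Python sketch below only illustrates that rule of thumb; it doesn't solve the display problem:

```python
import sys
from decimal import Decimal

x = 0.1

exact = Decimal(x)              # all 55 decimal places of the stored value
round_trip = format(x, ".17g")  # 17 significant digits

print(sys.float_info.dig)  # decimal digits a double always preserves
print(exact)
print(round_trip)
```

Parsing the 17-digit string gives back exactly the same double, so every digit of the exact expansion beyond that point is determined by the digits before it.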
The tool’s output is less noisy for a value that can be represented exactly:
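One way to check whether a given decimal is one of those exact cases is to compare it with the double it converts to, using exact rational arithmetic. A minimal Python sketch (the function name is made up for this example):

```python
from fractions import Fraction


def is_exactly_representable(literal: str) -> bool:
    """True if the decimal literal survives conversion to a double unchanged."""
    return Fraction(float(literal)) == Fraction(literal)


for literal in ("0.5", "0.375", "3", "0.1"):
    print(literal, is_exactly_representable(literal))
```

Values like 0.5 and 0.375 are sums of a few negative powers of two, which is why they pass while 0.1 does not.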
You can also use it for special values like NaN or Infinity:
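These special values are encoded with an exponent field of all ones: a zero fraction field means Infinity, anything else means NaN. A small Python sketch of that rule for doubles (classify is a hypothetical helper, not part of dumpfp):

```python
import math
import struct


def classify(value: float) -> str:
    """Name the IEEE 754 class of a double from its raw bits."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if biased_exponent != 0x7FF:
        return "finite"
    # Exponent field all ones: the fraction field picks Infinity or NaN.
    if fraction == 0:
        return "-Infinity" if bits >> 63 else "Infinity"
    return "NaN"


for value in (1.5, math.inf, -math.inf, math.nan):
    print(value, classify(value))
```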
And for very large values it will print the full integer value:
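Large doubles are always whole numbers, so their exact value can be written out as an integer. In Python, converting a float to int is exact, which makes for a quick sketch (the variable names are illustrative only):

```python
import sys

# Converting a float to int is exact, so each result is the true stored value.
exact_22 = int(1e22)    # 10**22 happens to be exactly representable
nearest_23 = int(1e23)  # 10**23 is not, so this is the nearest double instead
max_digits = len(str(int(sys.float_info.max)))  # digits in the largest double

print(exact_22)
print(nearest_23)
print(max_digits)
```

Seeing the full integer makes it obvious when a large literal such as 1e23 was silently rounded to a neighboring value.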
I learned a lot by writing this tool, and I hope it helps you understand floating point better. Floating-point numbers don’t have to be that mysterious: they have very specific values, as we can see. The trickiness comes from the fact that values get rounded if they can’t be represented exactly; hopefully this tool will make it clear when a value can be represented or not.
For example, one thing that can get you into trouble is trying to add two numbers where one is much larger than the other. Suppose you wanted to add 1 + 1e-16. The result should be 1.00000000000000001, but can this number be represented in floating-point?
We see here that the number 1.00000000000000001 can’t be represented in floating-point, and the closest approximation available to us was 1.0. We lost the smaller number completely! It’s not that operations like addition are imprecise; it’s that the result of the operation might not be possible to store in a floating-point value, so it rounds to the nearest representable number.
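The lost addend is easy to demonstrate. A minimal Python sketch, assuming IEEE 754 doubles, where sys.float_info.epsilon is the gap between 1.0 and the next larger double:

```python
import sys

big, small = 1.0, 1e-16
total = big + small

# Distance from 1.0 to the next representable double above it.
gap = sys.float_info.epsilon

print(total == big)     # True when the small addend was rounded away
print(gap)              # spacing of doubles just above 1.0
print(small < gap / 2)  # under half the gap, so the sum rounds back to 1.0
```

Anything smaller than half that gap rounds away entirely when added to 1.0, which is what happened here.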