#include <stdio.h>
int main() {
double x = 0.1;
printf("%f\n", x);
}
Here is the output I get on my system:
0.100000You might be tempted to think that the variable
x is indeed equal to the value 0.1, but don't be fooled! In fact 0.1 is a rounded version of x's true value, which will become apparent if we ask for more precision:
#include <stdio.h>
int main() {
double x = 0.1;
printf("%.30f\n", x);
}
0.100000000000000005551115123126It's hard to understand what we can't easily inspect. To remedy this, I've just written a new tool called dumpfp. Thing of it as the floating-point
toString() method you never had. It prints the precise value of the number, in both rational and decimal forms:
$ ./dumpfp 0.1
Single Precision (IEEE 32-bit):
raw = 0x3dcccccd
sign = 0x0
exponent = 0x7b (-4)
significand = 0x4ccccd
VALUE CALCULATION =
significand (1 + 5033165/2^23 (1.60000002384185791015625))
* 2^exponent (2^-4)
= VALUE (13421773/2^27 (0.100000001490116119384765625))
Double Precision (IEEE 64-bit):
raw = 0x3fb999999999999a
sign = 0x0
exponent = 0x3fb (-4)
significand = 0x999999999999a
VALUE CALCULATION =
significand (1 + 1351079888211149/2^51 (1.600000000000000088817841970012523233890533447265625))
* 2^exponent (2^-4)
= VALUE (3602879701896397/2^55 (0.1000000000000000055511151231257827021181583404541015625))
Notice that the actual value was not exactly 0.1, because binary floating point can only exactly represent rational values with a power-of-two denominator. You can see that the approximation is closer for double than it is for floatThe tool first breaks down the raw bytes of the value into its constituent parts. IEEE floating-point values consist of a sign bit, some number of exponent bits, followed by a significand. These values are then combined together according to the expression
significand * 2^exponent. The tool will show you all of the intermediate values and the final resultYou'll notice that these numbers have many decimal digits. Not all of these are significant. We can calculate how many digits are significant but I couldn't think of an easy-to-read way of printing the exact value but also indicating which digits are significant. If anyone has a bright idea here, please do drop me a line (or fork me on GitHub).
The tool's output is less noisy for a value that can be represented exactly:
$ ./dumpfp 0.5
Single Precision (IEEE 32-bit):
raw = 0x3f000000
sign = 0x0
exponent = 0x7e (-1)
significand = 0x0
VALUE CALCULATION =
significand (1 + 0/2^0 (1.0))
* 2^exponent (2^-1)
= VALUE (1/2^1 (0.5))
Double Precision (IEEE 64-bit):
raw = 0x3fe0000000000000
sign = 0x0
exponent = 0x3fe (-1)
significand = 0x0
VALUE CALCULATION =
significand (1 + 0/2^0 (1.0))
* 2^exponent (2^-1)
= VALUE (1/2^1 (0.5))
You can also use it for special values like NaN or Infinity:
$ ./dumpfp nan
Single Precision (IEEE 32-bit):
raw = 0x7fc00000
sign = 0x0
exponent = 0xff (NaN or Infinity)
significand = 400000 (non-zero indicates NaN)
Double Precision (IEEE 64-bit):
raw = 0x7ff8000000000000
sign = 0x0
exponent = 0x7ff (NaN or Infinity)
significand = 8000000000000 (non-zero indicates NaN)
And for very large values it will print the full integer value:
$ ./dumpfp 1e30
Single Precision (IEEE 32-bit):
raw = 0x7149f2ca
sign = 0x0
exponent = 0xe2 (99)
significand = 0x49f2ca
VALUE CALCULATION =
significand (1 + 2423141/2^22 (1.5777218341827392578125))
* 2^exponent (2^99)
= VALUE (1000000015047466219876688855040)
Double Precision (IEEE 64-bit):
raw = 0x46293e5939a08cea
sign = 0x0
exponent = 0x462 (99)
significand = 0x93e5939a08cea
VALUE CALCULATION =
significand (1 + 1300913865115253/2^51 (1.577721810442023642195863430970348417758941650390625))
* 2^exponent (2^99)
= VALUE (1000000000000000019884624838656)
I learned a lot by writing this tool, and I hope it helps you understand floating-point better. Floating-point numbers don't have to be that mysterious: they have very specific values as we can see. The trickiness comes from the fact that values get rounded if they can't be represented exactly; hopefully this tool will make it clear when a value can be represented or not.For example, one thing that can run you into trouble is trying to add two numbers where one is much larger than the other. Suppose you wanted to add
1 + 1e-16. The result should be 1.00000000000000001, but can this number be represented in floating-point?
$ ./dumpfp 1.00000000000000001
Single Precision (IEEE 32-bit):
raw = 0x3f800000
sign = 0x0
exponent = 0x7f (0)
significand = 0x0
VALUE CALCULATION =
significand (1 + 0/2^0 (1.0))
* 2^exponent (2^0)
= VALUE (1 + 0/2^0 (1.0))
Double Precision (IEEE 64-bit):
raw = 0x3ff0000000000000
sign = 0x0
exponent = 0x3ff (0)
significand = 0x0
VALUE CALCULATION =
significand (1 + 0/2^0 (1.0))
* 2^exponent (2^0)
= VALUE (1 + 0/2^0 (1.0))
We see here that the number 1.00000000000000001 can't be represented in floating-point, and the closest approximation we had available to us was 1.0. We lost the smaller number completely! It's not that operations like addition are imprecise, it's that the result of the operation might not be possible to store in a floating-point value, so it rounds to the nearest representable number.
Try colored or bold text for indicating significant digits. It works well for grep and other tools.
ReplyDeleteGood idea Dave -- I'll see about adding that.
DeleteI've never used math.h / cmath's frexp[f,l] before but I think it would remove your dependency on GMP and obviate any worrying about endianness, assuming it performs as stated.
ReplyDeleteThanks for the suggestion Ben, I hadn't heard of frexp. I don't think it removes the need for GMP though; I want to print the precise value in decimal (with all of its digits, even insignificant ones), which I need arbitrary precision arithmetic to do AFAICS.
Delete