What’s the difference between a single precision and double precision floating point operation?
- float: representable magnitudes span roughly $10^{-38} \sim 10^{38}$
- $\exp(-10^{2}) \approx 10^{-44}$, which underflows in float
- double: representable magnitudes span roughly $10^{-308} \sim 10^{308}$
- $\exp(-10^{3}) \approx 10^{-434}$, which underflows in double
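The two underflow cases can be checked with a minimal sketch. Python's own `float` is a double, so single precision is emulated here by a round trip through the `struct` module; the helper name `to_float32` is hypothetical. In single precision, $\exp(-10^{2})$ lands in the subnormal range (gradual underflow, nonzero but with very few significant bits), while in double precision $\exp(-10^{3})$ is flushed all the way to zero.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

tiny_single = to_float32(math.exp(-100))   # about 1e-44, below the normal float range
tiny_double = math.exp(-1000)              # about 1e-434, below the whole double range

print(tiny_single)                  # a subnormal: nonzero, but most precision is lost
print(tiny_single < 2.0**-126)      # True: smaller than the smallest normal float
print(tiny_double)                  # 0.0
```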
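0. 64-bit CPU
Calling a CPU a 64-bit machine usually means that it has 64-bit general purpose registers and a 64-bit memory address size. This has nothing to do with whether the arithmetic it ultimately performs is single precision or double precision.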
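As a rough illustration of that independence, the sketch below prints the pointer width of the running Python build next to the storage widths of the two floating point formats. The variable names are hypothetical, and the pointer width depends on the platform, so no particular value is assumed for it.

```python
import struct

pointer_bits = struct.calcsize("P") * 8   # address width of this build (platform dependent)
single_bits = struct.calcsize("<f") * 8   # storage width of a single-precision value
double_bits = struct.calcsize("<d") * 8   # storage width of a double-precision value

print(pointer_bits)
print(single_bits)   # 32
print(double_bits)   # 64
```

The last two numbers stay the same whether the first one is 32 or 64.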
1. Single precision
S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF
0 1 8 9 31
- The first bit is the sign bit, S;
- The middle 8 bits are the exponent, E;
- The last 23 bits are the fraction, F;
- E=0, F=0, S=1 => -0
- E=0, F=0, S=0 => 0
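The three fields can be pulled out of a real value with a few shifts and masks. This is a minimal sketch, assuming the value is first rounded to single precision through `struct`; the function name `float32_fields` is hypothetical.

```python
import struct

def float32_fields(x: float) -> tuple:
    """Return (S, E, F) of x after rounding it to single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # bit 0 in the layout: S
    exponent = (bits >> 23) & 0xFF    # next 8 bits: E
    fraction = bits & 0x7FFFFF        # last 23 bits: F
    return sign, exponent, fraction

print(float32_fields(6.5))    # (0, 129, 5242880), where 5242880 == 0b101 << 20
print(float32_fields(-0.0))   # (1, 0, 0)
```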
0 00000000 00000000000000000000000 = 0
E=0, F=0, S=0 => 0
1 00000000 00000000000000000000000 = -0
E=0, F=0, S=1 => -0
0 11111111 00000000000000000000000 = Infinity
1 11111111 00000000000000000000000 = -Infinity
0 11111111 00000100000000000000000 = NaN
E=255, F nonzero
1 11111111 00100010001001010101010 = NaN
E=255, F nonzero
0 10000000 00000000000000000000000 = +1 * 2**(128-127) * 1.0 = 2
0 10000001 10100000000000000000000 = +1 * 2**(129-127) * 1.101 = 6.5
binary 1.101 => 1 + 0.5 + 0.125 = 1.625
1 10000001 10100000000000000000000 = -1 * 2**(129-127) * 1.101 = -6.5
0 00000001 00000000000000000000000 = +1 * 2**(1-127) * 1.0 = 2**(-126)
0 00000000 10000000000000000000000 = +1 * 2**(-126) * 0.1 = 2**(-127)
0 00000000 00000000000000000000001 = +1 * 2**(-126) * 0.00000000000000000000001 = 2**(-149) (smallest positive value)
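The decoding rules behind these examples can be written down as a short sketch. It takes a pattern in the same `S E F` form used above and applies three cases: all-ones exponent (Infinity or NaN), zero exponent (zero or subnormal, no implicit leading 1), and everything else (normal, with an implicit leading 1). The function name `decode_float32` is hypothetical.

```python
import math

def decode_float32(pattern: str) -> float:
    """Decode a pattern like '0 10000001 1010...0' (S, E, F) into its value."""
    s, e, f = pattern.split()
    sign = -1.0 if s == "1" else 1.0
    exponent = int(e, 2)
    fraction = int(f, 2) / 2**23          # F read as the binary fraction 0.F
    if exponent == 255:                   # all-ones exponent: Infinity or NaN
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:                     # zero or subnormal: no implicit leading 1
        return sign * fraction * 2.0**-126
    return sign * (1 + fraction) * 2.0**(exponent - 127)

print(decode_float32("0 10000001 10100000000000000000000"))                # 6.5
print(decode_float32("0 00000000 00000000000000000000001") == 2.0**-149)   # True
```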
2. Double precision
S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
0 1 11 12 63
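The same field extraction works for this layout, with 1 sign bit, 11 exponent bits, and 52 fraction bits. The sketch below is a rough illustration; the function name `float64_fields` is hypothetical, and the exponent bias of 1023 used in the comment is the standard IEEE 754 value rather than something stated in the text above.

```python
import struct

def float64_fields(x: float) -> tuple:
    """Return (S, E, F) of a double-precision value."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # bit 0 in the layout: S
    exponent = (bits >> 52) & 0x7FF      # next 11 bits: E
    fraction = bits & ((1 << 52) - 1)    # last 52 bits: F
    return sign, exponent, fraction

# 6.5 = +1 * 2**(1025-1023) * binary 1.101, so E = 1025 and F starts with 101
print(float64_fields(6.5) == (0, 1025, 0b101 << 49))   # True
```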