Examples in this article were generated with R 4.0.5 by the packages Rmpfr,1 microbenchmark,2 and rational.3
See also a collection of other articles.
Shall we bother about numeric precision?
This article was inspired by a thread at the BEBA Forum.
It all started with an observation in Excel.
When you enter …
you will get
Well, as expected. But when you enter …
you will get
What? Strange at least.
It doesn’t matter whether parentheses are used or not (the same holds for all examples below).
Amazing!
(0.5 - 0.4 - 0.1)
R> [1] -2.775558e-17
Gimme more digits, pleeze!
cat(format((0.5 - 0.4 - 0.1), digits = 16), "\n")
R> -2.775557561562891e-17
Root cause analysis
m <- matrix(data = c(0.5, -0.4, -0.1,
                     (0.5 - 0.4 - 0.1)),
            dimnames = list(c("a", "b", "c",
                              "sum(a, b, c)"),
                            "value"))
print(m, digits = 17)
R> value
R> a 5.0000000000000000e-01
R> b -4.0000000000000002e-01
R> c -1.0000000000000001e-01
R> sum(a, b, c) -2.7755575615628914e-17
Note the last decimal place of b and c!
“Reach for the stars, even if you have to stand on a cactus.”
Arbitrarily accurate computation with R is provided by the package Rmpfr.
suppressMessages(library(Rmpfr)) # attach it
mpfr(c(0.5, -0.4, -0.1), prec = 260)
R> 3 'mpfr' numbers of precision 260 bits
R> [1] 0.5
R> [2] -0.40000000000000002220446049250313080847263336181640625
R> [3] -0.1000000000000000055511151231257827021181583404541015625
mpfr(0.5 - 0.4 - 0.1, prec = 260)
R> 1 'mpfr' number of precision 260 bits
R> [1] -2.77555756156289135105907917022705078125e-17
OK, more digits but still not what we expect.
Another example:
x <- seq(0.40, 0.43, 0.01)
print(x); print(x, digits = 17); mpfr(seq(0.40, 0.43, 0.01), prec = 260)
R> [1] 0.40 0.41 0.42 0.43
R> [1] 0.40000000000000002 0.41000000000000003
R> [3] 0.42000000000000004 0.42999999999999999
R> 4 'mpfr' numbers of precision 260 bits
R> [1] 0.40000000000000002220446049250313080847263336181640625
R> [2] 0.41000000000000003108624468950438313186168670654296875
R> [3] 0.42000000000000003996802888650563545525074005126953125
R> [4] 0.429999999999999993338661852249060757458209991455078125
Actually, it turned out to be the most frequently asked question about R, the (in)famous FAQ 7.31.4
What happened here? We have fallen into the trap of floating point arithmetic.
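The trap is easy to reproduce; perhaps the best-known example (not from the original thread, just the classic illustration):
0.1 + 0.2 == 0.3 # looks trivially TRUE…
R> [1] FALSE
print(0.1 + 0.2, digits = 17)
R> [1] 0.30000000000000004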
The first attempts to build something we now call a computer5 were made by Charles Babbage prior to 1840. Of course, both the Difference Engine and the Analytical Engine were – as their names suggest – purely mechanical. The latter was already programmable (its instruction set developed by Ada Lovelace). Their numeral system was decimal and hence, the examples above would likely have worked.
In 1941 Konrad Zuse completed construction of the Z3, the first working programmable, fully automatic digital computer. Once you deal with electrics (relays, vacuum tubes), it’s clear why Zuse decided to work with binary digits.
The signal is either off or on.
Fig.1 Binary numeral system – Gottfried Wilhelm Leibniz 1697. De Dyadicis. Signature LH XXXV, III B 1, Bl. 1–4
To convert a decimal number to binary, we split it into its integer part (repeated division by 2, collecting the remainders r) and its fractional part (repeated multiplication by 2, collecting the integer digits d). The procedure for 10.125 as an example; a small R implementation follows the list.
10 / 2 = 5: r 0
5 / 2 = 2: r 1
2 / 2 = 1: r 0
1 / 2 = 0: r 1
reordered from least significant bit upwards → [1010]2
0.125 × 2 = 0.25: d 0
0.25 × 2 = 0.50: d 0
0.50 × 2 = 1 : d 1
complete: [10.125]10 = [1010.001]2
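The procedure is easy to automate. A minimal R sketch – the helper dec2bin() is hypothetical, not part of any package:
dec2bin <- function(x, bits = 32) {
  int  <- floor(x)
  frac <- x - int
  # integer part: remainders of repeated division by 2,
  # collected from the least significant bit upwards
  i <- if (int == 0) "0" else {
    d <- integer(0)
    while (int > 0) {
      d   <- c(int %% 2, d)
      int <- int %/% 2
    }
    paste(d, collapse = "")
  }
  # fractional part: integer carries of repeated multiplication
  # by 2, truncated after at most `bits` binary digits
  f <- character(0)
  while (frac > 0 && length(f) < bits) {
    frac <- frac * 2
    f    <- c(f, floor(frac))
    frac <- frac - floor(frac)
  }
  paste0(i, if (length(f) > 0) ".", paste(f, collapse = ""))
}
dec2bin(10.125)
R> [1] "1010.001"
dec2bin(0.1) # truncated: the expansion never terminates
R> [1] "0.00011001100110011001100110011001"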
Now we will see why 0.5–0.4–0.1 is so difficult in the binary system.
[0.5]10 = [0.1]2
[0.4]10 = [0.011001100110011001100110011001100110…]2 (periodic: the block 1100 repeats)
[0.1]10 = [0.0001100110011001100110011001100110011…]2 (also periodic, shifted right by two bits)
Only real numbers that are finite sums of powers of two \(\small{2^{\,\mathbb{Z}}}\), where \(\small{\mathbb{Z}}\) is an integer {…, –1, 0, 1, …}, can be converted to a binary number without a remainder. This works for 0.5 (= \(\small{2^{-1}}\)) but not for 0.4 and 0.1; we get periodic numbers, which cannot be stored in the binary format without truncation.6
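R can show the truncation directly: the hexadecimal representation of a double (format %a, supported on most platforms) exposes the stored mantissa. Note that hex 9 = [1001]2 and hex a = [1010]2 – the trailing ‘a’ is where the periodic pattern was rounded off.
sprintf("%a", c(0.5, 0.4, 0.1))
R> [1] "0x1p-1"               "0x1.999999999999ap-2" "0x1.999999999999ap-4"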
According to IEEE 754, a binary number in double precision holds 64 bits (1 bit for the sign, 11 for the exponent, and 52 for the mantissa). With the implicit leading bit that means 53 bits of precision, which translates into ~15.95 decimal digits.
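R exposes these properties in the built-in .Machine variable:
.Machine$double.digits # bits of the mantissa (52 stored + 1 implicit)
R> [1] 53
.Machine$double.eps # relative resolution, 2^-52
R> [1] 2.220446e-16
.Machine$double.digits * log10(2) # decimal digits
R> [1] 15.95459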
Then Ohlbe posted this example, which caught me on the wrong foot at first. It boiled down to the question of why the second comparison works (although 5 is not a multiple of two):
0.5 - 0.4 - 0.1 == 0
R> [1] FALSE
5 - 4 - 1 == 0
R> [1] TRUE
[5]10 = [101]2
[4]10 = [100]2
[1]10 = [1]2
Easy.
I overlooked that the conversion to a binary works for any integer \(\small{\mathbb{Z}}\) within the range of \(\small{\left\{-2^{31}=-2,147,483,648\,\ldots\,2^{31}-1=2,147,483,647\right\}}\).
[+2147483647]10 = +[1111111111111111111111111111111]2
[−2147483648]10 = −[10000000000000000000000000000000]2
a <- as.integer(.Machine$integer.max)
a; class(a)
R> [1] 2147483647
R> [1] "integer"
b <- as.integer(1)
c <- as.numeric(1)
d <- a + b # does not work
R> Warning in a + b: NAs produced by integer overflow
d <- a + c # works
d; class(d) # type conversion!
R> [1] 2147483648
R> [1] "numeric"
What about the associative property?
a <- 0.5
b <- -0.4
c <- -0.1
a + b + c == (a + b) + c
R> [1] TRUE
a + b + c == a + (b + c)
R> [1] FALSE
a + b + c == (a + b + c)
R> [1] TRUE
a <- 5
b <- -4
c <- -1
a + b + c == (a + b) + c
R> [1] TRUE
a + b + c == a + (b + c)
R> [1] TRUE
a + b + c == (a + b + c)
R> [1] TRUE
When dealing with floating point arithmetic, the order (and parentheses) matter (THX to mittyri).
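A more drastic illustration with hypothetical values near \(\small{2^{53}}\), where the spacing between adjacent doubles is already 2 (hence, adding 1 to \(\small{-10^{16}}\) is lost by rounding):
(1e16 - 1e16) + 1 # exact: 0 + 1
R> [1] 1
1e16 + (-1e16 + 1) # -1e16 + 1 rounds back to -1e16
R> [1] 0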
If you want to compare double precision numbers (say, in the logical construct of a script, i.e., in if(), while(), repeat), do not use these goodies:
a <- 0.5 - 0.4 - 0.1
b <- 0
a == b # most commonly used
R> [1] FALSE
identical(a, b) # not better
R> [1] FALSE
BTW, identical() can give unexpected results.
a <- 2147483647
b <- 2147483647L
a == b
R> [1] TRUE
identical(a, b)
R> [1] FALSE
class(a); class(b)
R> [1] "numeric"
R> [1] "integer"
Here testing for equality passes because the numbers are not above the maximum possible integer of the system, i.e., \(\small{2^{31}-1}\) (actually .Machine$integer.max).
However, testing with identical() fails because it compares not only the values but also their classes.
Instead use:
a <- 0.5 - 0.4 - 0.1
b <- 0
all.equal(a, b)
R> [1] TRUE
Only this function compares values based on the numeric resolution of a 64 bit double precision number, which is \(\small{\sim2.220446\cdot10^{-16}}\). Actually the comparison is performed at its square root, \(\small{\sim1.490116\cdot10^{-8}}\).
The source of all.equal() is lengthy but we can mimic what goes on behind the curtain.
a <- 0.5 - 0.4 - 0.1
b <- 0
all.equal(a, b)
R> [1] TRUE
sqrt(.Machine$double.eps) >= abs(a - b)
R> [1] TRUE
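One caveat: if the values are not nearly equal, all.equal() does not return FALSE but a character string describing the difference. Hence, in flow control wrap it in isTRUE(), as recommended in its man page:
a <- 1
b <- 1.5
all.equal(a, b) # not FALSE!
R> [1] "Mean relative difference: 0.5"
isTRUE(all.equal(a, b)) # safe within if(), while(), …
R> [1] FALSE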
We can get \(\pi\) with up to 16 correct significant digits.
cat(formatC(pi, digits = 16, small.mark = "\u2219",
            small.interval = 3), "\n")
R> 3.141·592·653·589·793
But again, don’t dare to ask for more digits. As we have seen above, anything shown beyond the 16th significant digit is just ‘noise’.
"\u2219"
sm <-.16 <- formatC(pi, digits = 16, small.mark = sm,
pismall.interval = 3)
.31 <- formatC(pi, digits = 31, small.mark = sm,
pismall.interval = 3)
paste0("3.141", sm, "592", sm, "653", sm,
pi.corr <-"589", sm, "793", sm, "238", sm,
"462", sm, "643", sm, "383", sm,
"279")
data.frame(value = c("64 bit max",
df <-"64 bit \u2018noise\u2019",
"correct"),
pi = c(sprintf("%-40s%11s", pi.16, ""),
sprintf("%-40s%2s", pi.31, ""),
sprintf("%-40s%1s%1s",
"\u2026")))
pi.corr, sm, print(df, row.names = FALSE)
R> value pi
R> 64 bit max 3.141·592·653·589·793
R> 64 bit ‘noise’ 3.141·592·653·589·793·115·997·963·468·544
R> correct 3.141·592·653·589·793·238·462·643·383·279·…
Honestly, I don’t know why it is possible to ask R for more than 16 digits. At least it should throw a message like
Please use your wetware before asking!
We know that \(\sin\pi=0\). Given the above, can we really hope for that?
sin(pi)
R> [1] 1.224647e-16
Now you shouldn’t be surprised any more.
Will we fare better with Rmpfr?
Const("pi", prec = 260)
pi. <-sin(pi.)
R> 1 'mpfr' number of precision 260 bits
R> [1] 1.7396371592546498568836643545648074661258190954152778051042783221068713118047497e-79
Seems that to hope for zero is futile. However, this ‘better’ result comes at a price: speed.
library(microbenchmark) # attach it
res <- microbenchmark(sin(pi), sin(pi.),
                      unit = "relative", times = 1000L)
print(res, signif = 4)
R> Unit: relative
R> expr min lq mean median uq max neval cld
R> sin(pi) NaN 1.0 1.0 1.0 1.0 1.00 1000 a
R> sin(pi.) Inf 336.9 317.4 337.7 339.7 50.49 1000 b
A funny one in Excel discovered by mittyri:
1.2e+200 + 1e+100
R> [1] 1.2e+200
Bad luck (exhausting the double precision). It doesn’t make sense to hope for
12 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 001 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000.
Can you spot the 1?
Fig.2 Free42 v3.0.1
That’s fascinating. When I copied the output of π to the clipboard, it showed 34 significant digits (3.141592653589793238462643383279503), which is correct up to the penultimate digit (the last one is rounded up from 28). How is that possible? I consulted the documentation. Numbers are represented with a 34 digit mantissa and an exponent from \(\small{10^{-6143}}\) to \(\small{10^{+6144}}\), i.e., handled in extended precision!
Even the original HP-42S (1988!) represented numbers with a 12 digit mantissa and an exponent from \(\small{10^{-499}}\) to \(\small{10^{+499}}\), which is a wider range than the \(\small{10^{-308}}\) to \(\small{10^{+308}}\) of IEEE 754 double precision. I’m impressed.
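For comparison, we can mimic a 34 digit mantissa with Rmpfr, since \(\small{113\cdot\log_{10}2\approx34}\) – a sketch only, as the calculator uses a decimal format whereas mpfr is binary:
formatMpfr(Const("pi", prec = 113), digits = 34)
R> [1] "3.141592653589793238462643383279503"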
Fig.3 Windows 7 calc.exe
It gives 32 correct significant digits. Zero? Sorry, that’s beyond me.
Just to set all of this into perspective:
The diameter of a human hair is ~\(\small{10^{-12}}\) of the earth–moon distance. One nanometer is ~\(\small{10^{-13}}\) of the earth’s equator. Should we really be concerned about an ‘error’ which is three or more orders of magnitude smaller? 😉
Our measurements are likely never that precise anyway. Furthermore, the standard defines sophisticated rounding routines. Hence, in repeated calculations the error will not propagate upwards.
Hence, the answer to the question
Shall we bother about numeric precision?
is in general
No.
The physical constant known with the highest precision is the Faraday constant F (\(\small{9.648\,533\,212\,331\,001\,84\cdot10^{4}\,\text{A·s·mol}^{-1}}\)) with 18 significant digits. Unless you are an experimental physicist, double precision numbers are fine. Otherwise, opt for a language supporting extended precision (GCC C/C++, Clang, Intel C++, Object Pascal, Racket, Swift) or get a suitable scientific pocket calculator.
Acknowledgment
Members of the BEBA-Forum: ElMaestro, mittyri, Ohlbe, PharmCat, Shuanghe, and zizou.
License
Helmut Schütz 2021
1st version March 14, 2021.
Rendered 2021-04-13 13:30:21 CEST by rmarkdown in 1.34 seconds.
Footnotes and References
Maechler M, Heiberger RM, Nash JC, Borchers HW. Rmpfr: R MPFR - Multiple Precision Floating-Point Reliable. 2020-11-11. CRAN.↩︎
Mersmann O, Beleites C, Hurling R, Friedman A, Ulrich JM. microbenchmark: Accurate Timing Functions. 2019-09-24. CRAN.↩︎
Carnell R. rational: An R rational number class using a variety of class systems. 2021. GitHub.↩︎
Hornik K. R FAQ. Frequently Asked Questions on R. 2020-02-20. 7.31 Why doesn’t R think these numbers are equal?↩︎
Before the late 1940s computer was a job description: A person performing mathematical calculations.↩︎
Goldberg D. What Every Computer Scientist Should Know About Floating-Point Arithmetic. ACM Computing Surveys. 1991; 23(1), 5–48. doi:10.1145/103162.103163. Open source.↩︎