
Examples in this article were generated with R 4.1.0 using the packages Rmpfr,1 microbenchmark,2 and rational.3

See also a collection of other articles.

  • The right-hand badges give the respective section’s ‘level’.
    
  1. Basics – requiring no or only limited expertise in computer languages or spreadsheets.
    
  2. These sections are the most important ones. They are – hopefully – easily comprehensible even for novices.
    
  3. A somewhat higher knowledge of computer languages (and especially R) is required. May be skipped or reserved for a later reading.
    
  4. An advanced knowledge of R is required. Definitely not recommended for beginners.
  • Click to show / hide R code.

Introduction

    

Shall we bother about numeric precision?

This article was inspired by a thread at the BEBA Forum.
It all started with an observation in Excel.

When you enter …

0.5 - 0.4 - 0.1 you will get

0

Well, as expected. But when you enter …

(0.5 - 0.4 - 0.1) you will get

-2.77556E-17

What? Strange at least.


Other Software

Gnumeric

0.5 - 0.4 - 0.1
It doesn’t matter whether parentheses are used or not (as in all of the others below).

Maxima

0.5 - 0.4 - 0.1

GNU Octave

0.5 - 0.4 - 0.1

OpenOffice Calc

0.5 - 0.4 - 0.1
Amazing!

R

(0.5 - 0.4 - 0.1)
# [1] -2.775558e-17

Gimme more digits, pleeze!

cat(format((0.5 - 0.4 - 0.1), digits = 16), "\n")
# -2.775557561562891e-17

Root cause analysis

m <- matrix(data = c(0.5, -0.4, -0.1,
                    (0.5 - 0.4 - 0.1)),
            dimnames = list(c("a", "b", "c",
                              "sum(a, b, c)"),
                              "value"))
print(m, digits = 17)
#                                value
# a             5.0000000000000000e-01
# b            -4.0000000000000002e-01
# c            -1.0000000000000001e-01
# sum(a, b, c) -2.7755575615628914e-17

Note the last decimal place of b and c!

Reach for the stars, even if you have to stand on a cactus.
Susan Longacre

Arbitrary-precision computation in R is provided by the package Rmpfr.

suppressMessages(library(Rmpfr)) # attach it
mpfr(c(0.5, -0.4, -0.1), prec = 260)
# 3 'mpfr' numbers of precision  260   bits 
# [1]                                                        0.5
# [2]   -0.40000000000000002220446049250313080847263336181640625
# [3] -0.1000000000000000055511151231257827021181583404541015625
mpfr(0.5 - 0.4 - 0.1, prec = 260)
# 1 'mpfr' number of precision  260   bits 
# [1] -2.77555756156289135105907917022705078125e-17

OK, more digits but still not what we expect.

Another example:

x <- seq(0.40, 0.43, 0.01)
print(x); print(x, digits = 17); mpfr(x, prec = 260)
# [1] 0.40 0.41 0.42 0.43
# [1] 0.40000000000000002 0.41000000000000003
# [3] 0.42000000000000004 0.42999999999999999
# 4 'mpfr' numbers of precision  260   bits 
# [1]  0.40000000000000002220446049250313080847263336181640625
# [2]  0.41000000000000003108624468950438313186168670654296875
# [3]  0.42000000000000003996802888650563545525074005126953125
# [4] 0.429999999999999993338661852249060757458209991455078125

Actually, this turned out to be the most frequently asked question about R – the (in)famous FAQ 7.31.4

What happened here? We have fallen into the trap of floating point arithmetic.


History

    

The first attempts to build something we now call a computer5 were made by Charles Babbage prior to 1840. Of course, both the Difference Engine and the Analytical Engine were – as their names suggest – purely mechanical. The latter was already programmable (its instruction set developed by Ada Lovelace). Their numeral system was decimal and hence the examples above would likely have worked easily.

In 1941 Konrad Zuse completed construction of the Z3, the first programmable, fully automatic digital computer. Once you deal with electrical components (relays, vacuum tubes), it’s clear why Zuse decided to work with binary digits: the signal is either off (0) or on (1).


Numeral Systems


Fig.1 Binary numeral system – Gottfried Wilhelm Leibniz 1697.
De Dyadicis Signature LH XXXV, III B 1, Bl. 1-4

    

To convert a decimal number to binary, we have to split it into its integer and fractional parts. As an example, the procedure for 10.125 (r = remainder, d = digit in front of the binary point):

10 / 2 = 5: r 0
 5 / 2 = 2: r 1
 2 / 2 = 1: r 0
 1 / 2 = 0: r 1

remainders read in reverse order (the last one becomes the most significant bit) → [1010]2

0.125 × 2 = 0.25: d 0
0.25  × 2 = 0.50: d 0
0.50  × 2 = 1   : d 1

complete: [10.125]10 = [1010.001]2
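A minimal R sketch of this procedure (the helper dec2bin() and the 32-bit cut-off for the fractional part are my own choices, not part of the original article):

dec2bin <- function(x, frac.bits = 32) {
  int <- floor(x)                 # integer part: repeated division by 2
  i   <- character(0)
  while (int > 0) {
    i   <- c(int %% 2, i)         # collect remainders, last one first
    int <- int %/% 2
  }
  if (length(i) == 0) i <- "0"
  frac <- x - floor(x)            # fractional part: repeated multiplication by 2
  f    <- character(0)
  while (frac > 0 && length(f) < frac.bits) {
    frac <- frac * 2
    f    <- c(f, floor(frac))     # the digit in front of the binary point
    frac <- frac - floor(frac)
  }
  paste0(paste(i, collapse = ""), ".", paste(f, collapse = ""))
}
dec2bin(10.125) # "1010.001" - terminates
dec2bin(0.4)    # "0.01100110011001100110011001100110" - the group 0110 keeps repeating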

Now we will see why 0.5–0.4–0.1 is so difficult in the binary system.

[0.5]10 = [0.1]2
[0.4]10 = [0.011001100110011001100110011001100110…]2 (the group 0110 repeats indefinitely)
[0.1]10 = [0.0001100110011001100110011001100110011…]2 (after the leading 0, the group 0011 repeats indefinitely)

Only dyadic rationals – numbers which can be written as \(\small{m\cdot2^{k}}\) with integers \(\small{m}\) and \(\small{k}\) (\(\small{\mathbb{Z}=\{\ldots,-1,0,1,\ldots\}}\)) – can be converted to a binary number without a remainder. This works for 0.5 (\(\small{=2^{-1}}\)) but not for 0.4 and 0.1; we get periodic numbers, which cannot be stored in the binary format without truncation.6

According to IEEE 754 a binary number in double precision holds 64 bits (1 bit for the sign, 11 for the exponent, and 52 for the mantissa). Together with the implicit leading bit this gives 53 bits of precision, which translates into ~15.95 decimal digits.
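These limits can be queried from R itself; the decimal-digit estimate is simply \(\small{53\cdot\log_{10}2}\):

.Machine$double.digits # bits of the significand (52 stored + 1 implicit)
# [1] 53
.Machine$double.eps    # relative resolution, 2^-52
# [1] 2.220446e-16
53 * log10(2)          # significant decimal digits
# [1] 15.95459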

Then Ohlbe posted this example, which caught me on the wrong foot at first. It boiled down to the question why the second one works (although 5 is not a power of two):

0.5 - 0.4 - 0.1 == 0
# [1] FALSE
5 - 4 - 1 == 0
# [1] TRUE

[5]10 = [101]2
[4]10 = [100]2 
[1]10 = [1]2

Easy.
I overlooked that the conversion to binary works exactly for any integer \(\small{\mathbb{Z}}\) within the range of R’s integer type, \(\small{\left\{-2^{31}=-2,147,483,648\,\ldots\,2^{31}-1=2,147,483,647\right\}}\).

 [2147483647]10 = +[1111111111111111111111111111111]2
-[2147483648]10 = -[10000000000000000000000000000000]2

a <- as.integer(.Machine$integer.max)
a; class(a)
# [1] 2147483647
# [1] "integer"
b <- as.integer(1)
c <- as.numeric(1)
d <- a + b  # does not work
# Warning in a + b: NAs produced by integer overflow
d <- a + c  # works
d; class(d) # type conversion!
# [1] 2147483648
# [1] "numeric"

What about the associative property?

a <-  0.5
b <- -0.4
c <- -0.1
a + b + c == (a +  b) + c
# [1] TRUE
a + b + c ==  a + (b  + c)
# [1] FALSE
a + b + c == (a +  b  + c)
# [1] TRUE
a <-  5
b <- -4
c <- -1
a + b + c == (a +  b) + c
# [1] TRUE
a + b + c ==  a + (b  + c)
# [1] TRUE
a + b + c == (a +  b  + c)
# [1] TRUE

When dealing with floating point arithmetic, the order of operations (and the parentheses) matters (THX to mittyri).


To Do and Not to Do

    

If you want to compare double precision numbers (say, in the logical construct of a script, i.e., if(), while(), repeat()), do not use these goodies:

a <- 0.5 - 0.4 - 0.1
b <- 0
a == b          # most commonly used
# [1] FALSE
identical(a, b) # not better
# [1] FALSE
    

BTW, identical() can give unexpected results.

a <- 2147483647
b <- 2147483647L
a == b
# [1] TRUE
identical(a, b)
# [1] FALSE
class(a); class(b)
# [1] "numeric"
# [1] "integer"

Here testing for equality passes because the numbers do not exceed the maximum possible integer of the system, i.e., \(\small{2^{31}-1}\) (actually .Machine$integer.max).
However, testing with identical() fails because it compares not only the values but also their classes.

    

Instead use:

a <- 0.5 - 0.4 - 0.1
b <- 0
all.equal(a, b)
# [1] TRUE

Only this function compares values based on the numeric resolution of a 64 bit double precision number, which is \(\small{\sim2.220446\cdot10^{-16}}\). Actually the comparison is performed at its square root, \(\small{\sim1.490116\cdot10^{-8}}\).

The source of all.equal() is lengthy but we can mimic what goes on behind the curtain.

a <- 0.5 - 0.4 - 0.1
b <- 0
all.equal(a, b)
# [1] TRUE
sqrt(.Machine$double.eps) >= abs(a - b)
# [1] TRUE
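One caveat worth adding (my note, not from the original): when the values are not nearly equal, all.equal() does not return FALSE but a character string describing the difference. Inside if() or while() it should therefore be wrapped in isTRUE():

a <- 0.5 - 0.4 - 0.1
all.equal(a, 1)                # not FALSE but a descriptive string
# [1] "Mean absolute difference: 1"
if (isTRUE(all.equal(a, 0))) { # safe inside logical constructs
  cat("equal within numeric tolerance\n")
}
# equal within numeric tolerance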

We can get \(\pi\) with up to 16 correct significant digits.

cat(formatC(pi, digits = 16, small.mark = "\u2219",
                small.interval = 3), "\n")
# 3.141·592·653·589·793

But again, don’t dare to ask for more digits. As we have seen above, anything shown beyond the 16th significant digit is just ‘noise’.

sm      <- "\u2219"
pi.16   <- formatC(pi, digits = 16, small.mark = sm,
                  small.interval = 3)
pi.31   <- formatC(pi, digits = 31, small.mark = sm,
                  small.interval = 3)
pi.corr <- paste0("3.141", sm, "592", sm, "653", sm,
                    "589", sm, "793", sm, "238", sm,
                    "462", sm, "643", sm, "383", sm,
                    "279")
df <- data.frame(value = c("64 bit max",
                           "64 bit \u2018noise\u2019",
                           "correct"),
                 pi = c(sprintf("%-40s%11s", pi.16, ""),
                        sprintf("%-40s%2s", pi.31, ""),
                        sprintf("%-40s%1s%1s",
                                pi.corr, sm, "\u2026")))
print(df, row.names = FALSE)
#           value                                          pi
#      64 bit max 3.141·592·653·589·793                      
#  64 bit ‘noise’ 3.141·592·653·589·793·115·997·963·468·544  
#         correct 3.141·592·653·589·793·238·462·643·383·279·…

Honestly, I don’t know why it is possible to ask R for more than 16 digits. At least it should throw a message like
      Please use your wetware before asking!

We know that \(\sin\pi=0\). Given the above, can we really hope for that?

sin(pi)
# [1] 1.224647e-16

Now you shouldn’t be surprised any more.
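Why this particular magnitude? The double representation of pi falls short of the true value by roughly \(\small{1.22\cdot10^{-16}}\), and since \(\small{\sin(\pi-\delta)=\sin\delta\approx\delta}\) for small \(\small{\delta}\), sin(pi) essentially returns the representation error of pi. A quick check with Rmpfr’s Const(), which gives \(\pi\) at the requested precision:

delta <- Const("pi", prec = 260) - mpfr(pi, prec = 260) # true pi minus the double
asNumeric(delta)
# [1] 1.224647e-16
sin(pi)
# [1] 1.224647e-16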

Will we fare better with Rmpfr?

pi. <- Const("pi", prec = 260)
sin(pi.)
# 1 'mpfr' number of precision  260   bits 
# [1] 1.7396371592546498568836643545648074661258190954152778051042783221068713118047497e-79

It seems that hoping for zero is futile. However, this ‘better’ result comes at a price: speed.

library(microbenchmark) # attach it
res <- microbenchmark(sin(pi), sin(pi.),
                      unit = "relative", times = 1000L)
print(res, signif = 4)
# Unit: relative
#      expr min    lq  mean median    uq   max neval cld
#   sin(pi) NaN   1.0   1.0    1.0   1.0  1.00  1000  a 
#  sin(pi.) Inf 352.9 289.6  354.7 178.9 69.33  1000   b

A funny one in Excel discovered by mittyri:

1.2E+200 + 1E+100

1.2e+200 + 1e+100
# [1] 1.2e+200

Bad luck (we have exhausted the double precision). It doesn’t make sense to hope for
 12 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 001 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000.
Can you spot the 1?
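The reason: the spacing between adjacent doubles grows with their magnitude. Near \(\small{1.2\cdot10^{200}}\) one unit in the last place is roughly \(\small{10^{184}}\) – vastly larger than the \(\small{10^{100}}\) we tried to add, so the sum rounds straight back to the original value. A back-of-the-envelope check (assuming the 52-bit mantissa discussed above):

x   <- 1.2e200
ulp <- 2^(floor(log2(x)) - 52) # spacing of adjacent doubles near x: ~1.7e184
1e100 < ulp / 2                # the addend is smaller than half of that spacing ...
# [1] TRUE
x + 1e100 == x                 # ... and therefore vanishes completely
# [1] TRUE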


For Geeks only!

Epilogue

    

That’s fascinating. When I copied the output of π to the clipboard it showed 34 significant digits (3.141592653589793238462643383279503), which is correct up to the penultimate digit (the last digit, 3, is rounded up from 2 because the true value continues with 88…). How is that possible? I consulted the documentation. Numbers are represented with a 34 digit mantissa and an exponent from \(\small{10^{-6,143}}\) to \(\small{10^{+6,144}}\), i.e., handled in extended precision!
Even the original HP-42S (1988!) represented numbers with a 12 digit mantissa and an exponent from \(\small{10^{-499}}\) to \(\small{10^{+499}}\), which is larger than the IEEE 754 double precision range of \(\small{\sim10^{-308}}\) to \(\small{\sim10^{+308}}\). I’m impressed.

Fig.3 Windows 7 calc.exe

π gives 32 correct significant digits – but sin π gives exactly zero? Sorry, that’s beyond me.


Post Scriptum

Just to set all of this into perspective:
The diameter of a human hair is ~\(\small{10^{-12}}\) of the earth-moon distance. One nanometer is ~\(\small{10^{-13}}\) of the earth’s equator. Should we really be concerned about an ‘error’ which is three or more orders of magnitude smaller? 😉
Our measurements are likely never that precise anyway. Furthermore, the standard defines sophisticated rounding routines. Hence, in repeated calculations the error will not propagate upwards.

Hence, the answer to the question
    Shall we bother about numeric precision?
is in general
    No.

The physical constant with the highest precision is the Faraday constant \(\small{F}\) (\(\small{9.648\,533\,212\,331\,001\,84\cdot10^{4}\;\textrm{A·s·mol}^{-1}}\)) with 18 significant digits. Unless you are an experimental physicist, double precision numbers are fine. Otherwise, opt for a language supporting extended precision (GCC C/C++, Clang, Intel C++, Object Pascal, Racket, Swift) or get a suitable scientific pocket calculator.


Acknowledgment

Members of the BEBA-Forum: ElMaestro, mittyri, Ohlbe, PharmCat, Shuanghe, and zizou.

License

CC BY 4.0 Helmut Schütz 2021
1st version March 14, 2021.
Rendered 2021-06-23 14:03:51 CEST by rmarkdown in 1.35 seconds.

Footnotes and References


  1. Maechler M, Heiberger RM, Nash JC, Borchers HW. Rmpfr: R MPFR - Multiple Precision Floating-Point Reliable. 2021-04-11. CRAN.↩︎

  2. Mersmann O, Beleites C, Hurling R, Friedman A, Ulrich JM. microbenchmark: Accurate Timing Functions. 2019-09-24. CRAN.↩︎

  3. Carnell R. rational: An R rational number class using a variety of class systems. 2021. GitHub.↩︎

  4. Kurt Hornik. R FAQ. Frequently Asked Questions on R. 2020-02-20. 7.31 Why doesn’t R think these numbers are equal?↩︎

  5. Before the late 1940s ‘computer’ was a job description: a person performing mathematical calculations.↩︎

  6. David Goldberg. What Every Computer Scientist Should Know About Floating-Point Arithmetic. ACM Computing Surveys. 1991; 23(1): 5–48. doi:10.1145/103162.103163.  Open Access.↩︎

  7. Hadley Wickham. Advanced R. R6. 2019.↩︎