Floating-Point Arithmetic 8 13
approximation to xp( x) = ln(l + x). Is
there a value for 5 for which 2 and
5 + 1 can be computed accurately? There
is; namely,
2 = (1 @ x) e 1, because
then 1 + 2 is exactly equal to 1 @ x.
The results of this section can be sum-
marized by saying that a guard digit
guarantees accuracy when nearby pre-
cisely known quantities are subtracted
(benign cancellation). Sometimes a for-
mula that gives inaccurate results can be
rewritten to have much higher numeri -
cal accuracy by using benign cancella-
tion; however, the procedure only works
if subtraction is performed using a guard
digit. The price of a guard digit is not
high because is merely requires making
the adder 1 bit wider. For a 54 bit double
precision adder, the additional cost is less
than 2%. For this price, you gain the
ability to run many algorithms such as
formula (6) for computing the area of a
triangle and the expression in Theorem 4
for computing ln(l + ~). Although most
modern computers have a guard digit,
there are a few (such as Crays) that
do not.
1.5 Exactly Rounded Operations
When floating-point operations are done
with a guard digit, they are not as accu-
rate as if they were computed exactly
then rounded to the nearest floating-point
number. Operations performed in this
manner will be called
exactly rounded.
The example immediately preceding
Theorem 2 shows that a single guard
digit will not always give exactly rounded
results. Section 1.4 gave several exam-
ples of algorithms that require a guard
digit in order to work properly. This sec-
tion gives examples of algorithms that
require exact rounding.
So far, the definition of rounding has
not been given. Rounding is straightfor-
ward, with the exception of how to round
halfway cases; for example, should 12.5
mnnd to 12 OP12? Ofie whool of thought
divides the 10 digits in half, letting
{0, 1,2,3,4} round down and {5,6,’7,8,9}
round up; thus 12.5 would round to 13.
This is how rounding works on Digital
Equipment Corporation’s VAXG comput -
ers. Another school of thought says that
since numbers ending in 5 are halfway
between two possible roundings, they
should round down half the time and
round up the other half. One way of ob -
taining this 50’%0behavior is to require
that the rounded result have its least
significant digit be even. Thus 12.5
rounds to 12 rather than 13 because 2 is
even. Which of these methods is best,
round up or round to even? Reiser and
Knuth [1975] offer the following reason
for preferring round to even.
Theorem 5
Let x and y be floating-point numbers,
and define X. = x,
xl=(xOey)O
y,...,=(x(ley)@y)If@If@ and
e are exactly rounded using round to
even, then either x. = x for all n or x. = xl
foralln >1.
❑
To clarify this result, consider ~ = 10,
p = 3 and let x = 1.00, y = –.555.
When rounding up, the sequence be-
comes
X. 9 Y = 1.56, Xl = 1.56 9 .555
= 1.01, xl e y ~ LO1 Q .555 = 1.57,
and each successive value of x. in-
creases by .01. Under round to even, x.
is always 1.00. This example suggests
that when using the round up rule, com-
putations can gradually drift upward,
whereas when using round to even the
theorem says this cannot happen.
Throughout the rest of this paper, round
to even will be used.
One application of exact rounding oc-
curs in multiple precision arithmetic.
There are two basic approaches to higher
precision. One approach represents float -
ing-point numbers using a very large sig-
nificant, which is stored in an array of
words, and codes the routines for manip-
ulating these numbers in assembly lan-
guage. The second approach represents
higher precision floating-point numbers
as an array of ordinary floating-point
‘VAX is a trademark of Digital Equipment
Corporation.
ACM Computmg Surveys, Vol 23, No. 1, March 1991