Notes A5: Numerical Issues in Scientific Programming

Math 60 -- D. C. Smolarski, S.J.
Santa Clara University, Department of Mathematics and Computer Science

Introduction

There are at least three major areas of concern regarding numerical issues (developed further in Math 166 or AMATH 118):
  1. precision;
  2. relative compatibility of magnitudes of operands in real arithmetic;
  3. reduction of arithmetic operations.
Each of these is discussed in turn below.

Precision

Novice programmers often forget that computers deal with finite precision arithmetic and that "fractions" do not exist in most languages. Thus (assuming integer arithmetic),
         1/9 * 45
results in zero rather than 5, since integer division truncates 1/9 to zero before the multiplication takes place.

Even assuming real arithmetic, the decimal expansion of 1/9 is limited to the number of places allotted for real numbers. Thus it is possible that 1.0/9.0 * 9.0 will NOT equal 1.0.

On some machines, 1.0/9.0 may equal 0.1111111, which when multiplied by 9.0 results in 0.9999999 rather than 1.0000000.
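
Both effects can be reproduced in a short sketch. Python is used here only for illustration (these notes do not name a language). Its `//` operator performs integer division, and the standard `decimal` module is set to 7 significant digits to imitate the hypothetical 7-place machine described above. The name `one_ninth_times_nine` is made up for this example.

```python
from decimal import Decimal, localcontext

# Integer arithmetic: 1 // 9 is truncated to 0 before the multiplication.
integer_result = 1 // 9 * 45
print(integer_result)  # prints 0, not 5


def one_ninth_times_nine(digits=7):
    """Compute 1/9 and (1/9)*9 on an assumed machine keeping `digits` decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        ninth = Decimal(1) / Decimal(9)   # rounded to `digits` significant digits
        product = ninth * Decimal(9)
    return ninth, product


ninth, product = one_ninth_times_nine()
print(ninth)         # prints 0.1111111
print(product)       # prints 0.9999999
print(product == 1)  # prints False
```

Real hardware usually stores reals in binary rather than decimal, so the particular values that misbehave differ from machine to machine, but the cause is the same: the stored quotient is only an approximation of 1/9.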

Relative Compatibility

Because only a limited number of digits is stored, adding a small number to a large number can produce a sum equal to the large number alone. For example, suppose a = 3256.1 (or 0.32561e04) and b = 0.000008263 (or 0.82630e-5). What happens if you were to add a and b? The exponents differ by 9. If you were using single precision variables with an accuracy of about 7 significant digits, each number by itself would be stored with full accuracy, but every digit of b lies beyond the last digit kept for a, so the sum would be no different from the value of a by itself.
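
A minimal sketch of this absorption effect, again using Python's `decimal` module as a stand-in for a machine that keeps 7 significant decimal digits (an assumption made for illustration; the helper name `add_with_digits` is hypothetical):

```python
from decimal import Decimal, localcontext


def add_with_digits(a, b, digits):
    """Add two decimal strings, rounding the sum to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(a) + Decimal(b)


a = "3256.1"
b = "0.000008263"

seven = add_with_digits(a, b, 7)
sixteen = add_with_digits(a, b, 16)

print(seven)                  # prints 3256.100
print(seven == Decimal(a))    # prints True: b has been lost entirely
print(sixteen)                # prints 3256.100008263
```

With 16 digits available the contribution of b survives, which is why switching to double precision often hides (but does not remove) this kind of problem.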

This has implications for repeated additions or multiplications. A good rule is to arrange computations so that, as far as possible, operations are performed on numbers of roughly the same magnitude. Thus small numbers should be added to small numbers and large numbers to large numbers.

As an example, one can get different answers when adding the first 30,000 terms of the harmonic series

  1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + ... + 1/30,000

depending on whether one starts with the first term (1) or with the last term (1/30,000).

See this sample program.
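
Separately from the course's sample program, the comparison can be sketched as follows. The 7-digit decimal arithmetic is an assumption chosen to match the earlier discussion, and the function name `harmonic_sums` is made up for this example. A double precision sum is included only as a reference value.

```python
import math
from decimal import Decimal, localcontext


def harmonic_sums(n_terms=30000, digits=7):
    """Sum 1 + 1/2 + ... + 1/n_terms in both orders with `digits` decimal digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        one = Decimal(1)

        # Largest terms first: later terms are tiny compared with the running sum.
        forward = Decimal(0)
        for n in range(1, n_terms + 1):
            forward += one / Decimal(n)

        # Smallest terms first: the running sum stays comparable to each new term.
        backward = Decimal(0)
        for n in range(n_terms, 0, -1):
            backward += one / Decimal(n)

    return forward, backward


forward, backward = harmonic_sums()
reference = math.fsum(1.0 / n for n in range(1, 30001))

print("starting with 1:        ", forward)
print("starting with 1/30,000: ", backward)
print("double precision value: ", reference)
```

Under these assumptions the two orders give different sums, and the one that starts with the small terms should land closer to the reference value, in line with the rule of adding small numbers to small numbers first.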

Reduction of Arithmetic Operations

It is standard practice to assume that fewer arithmetic operations lead to less propagation of numerical inaccuracies. Thus, whenever possible, programmers attempt to reduce the number of arithmetic operations.

Sometimes formulas for functions (e.g., sin x, cos x, e^x) are based on the Taylor or Maclaurin Series expansions (cf. Stewart, Calculus). For example,

sin x = x - x^3/3! + x^5/5! - x^7/7! + ...
and
e^x = 1 + x/1! + x^2/2! + x^3/3! + ...
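
As a rough illustration, the partial sums of these two series can be evaluated directly as written, with an explicit power and factorial in every term. This is only a sketch (the names `sin_maclaurin` and `exp_maclaurin` are made up here), not how library routines necessarily compute these functions.

```python
import math


def sin_maclaurin(x, n_terms=4):
    """Partial sum x - x^3/3! + x^5/5! - ... using n_terms terms."""
    total = 0.0
    for k in range(n_terms):
        power = 2 * k + 1
        total += (-1) ** k * x ** power / math.factorial(power)
    return total


def exp_maclaurin(x, n_terms=4):
    """Partial sum 1 + x/1! + x^2/2! + ... using n_terms terms."""
    total = 0.0
    for k in range(n_terms):
        total += x ** k / math.factorial(k)
    return total


x = 0.5
print(sin_maclaurin(x), math.sin(x))  # series value, then library value
print(exp_maclaurin(x), math.exp(x))
```

Every term here costs a power, a factorial and a division, which is the kind of operation count the rest of this section tries to reduce.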

Now, one way to reduce the number of operations is to make use of Horner's Rule in which powers of a variable are changed into nested multiplications. For example,


       ax^3 + bx^2 + cx + d

can be rewritten as

       ((ax + b)x + c)x + d

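The first expression contains 2 powers, 3 multiplications and 3 additions. The second expression contains 0 powers, 3 multiplications and 3 additions, a reduction of 2 power operations. If each power is itself computed by repeated multiplication (the cube as x*x*x, the square as x*x), the first form costs 6 multiplications in all, against 3 for the nested form.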
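
Horner's Rule extends to a polynomial of any degree. A minimal sketch, assuming the coefficients are supplied from the highest power down to the constant term (the function names `horner` and `naive_cubic` are illustrative only):

```python
def horner(coeffs, x):
    """Evaluate a polynomial by nested multiplication.

    coeffs lists the coefficients from the highest power to the constant,
    e.g. [a, b, c, d] for a*x^3 + b*x^2 + c*x + d.
    """
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c   # one multiplication and one addition per step
    return result


def naive_cubic(a, b, c, d, x):
    """Evaluate the same cubic with explicit powers, for comparison."""
    return a * x ** 3 + b * x ** 2 + c * x + d


print(horner([2, -3, 4, 5], 3))    # prints 44
print(naive_cubic(2, -3, 4, 5, 3)) # prints 44
```

For a polynomial of degree n the loop performs n multiplications and n additions and never forms a power of x.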

It is possible to rewrite a short version of e^x as given above,

e^x approx= x^3/3! + x^2/2! + x/1! + 1
as
((x/3! + 1/2!)x + 1)x + 1

See this sample program.
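
Independently of the course's sample program, the two forms of the cubic approximation can be compared in a short sketch. The function names are made up for this example, and the factorials 3! = 6 and 2! = 2 are written out as constants.

```python
import math


def exp_cubic_powers(x):
    """x^3/3! + x^2/2! + x/1! + 1, written with explicit powers."""
    return x ** 3 / 6 + x ** 2 / 2 + x / 1 + 1


def exp_cubic_nested(x):
    """The same cubic in nested form: ((x/3! + 1/2!)x + 1)x + 1."""
    return ((x / 6 + 1 / 2) * x + 1) * x + 1


x = 0.5
print(exp_cubic_powers(x))
print(exp_cubic_nested(x))
print(math.exp(x))  # library value, for comparison
```

The two forms are algebraically identical, so any difference between the first two printed values comes from rounding alone. The gap between either of them and the library value comes from truncating the series after four terms.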


This page is maintained by Dennis C. Smolarski, S.J. dsmolarski@math.scu.edu
© Copyright 2000 Dennis C. Smolarski, SJ, All rights reserved.
Last changed: 11 February 2000.