The Wichmann-Hill Random Number Algorithm

Random number generator algorithms are notoriously tricky. Recently I was reading about RNGs and learned that Excel 2003 had a seriously broken random number function that used the Wichmann-Hill algorithm. I decided to take a closer look and see if I could guess how Excel botched a relatively simple (but tricky) algorithm.

A linear congruential random number generator returns a number between [0.0 and 1.0) using the equation:

X(i) = [(a * X(i-1) + c) mod m] / m

In words, “the new random number is some constant a times the previous random number, plus some constant c, modulo some constant m (giving an integer part), then divided by m (to give a value between 0.0 and 1.0). For example suppose the integer part of the last generated value was 34, and a = 13, and c = 5, and m = 100. Then the next random value is (13 * 34 + 5) mod 100 / 100 = 447 mod 100 / 100 = 0.47.

The Wichmann-Hill (WH) algorithm is an extension of linear congruential. WH uses three linear congruential equations, adds the results, then takes the decimal part of the sum. For example, if the three WH equations give 0.68, 0.92, and 0.35, then the final result random number is decimal(0.78 + 0.92 + 0.45) = decimal(2.95) = 0.95.


I found the original WH research paper from 1982. It was fascinating. The paper gave a code implementation in Fortran. I suspect the Excel developers botched the part of WH that takes the decimal part of the sum. The Fortran code uses AMOD(sum, 1.0) which means the sum of the three equation modulo 1.0.

If you compute integer x mod 1 in most programming languages, you get the remainder after division by 1 which is always 0. For example 17 mod 1 is 0 because 1 goes into 17, 17 times with 0 left over. But in Fortran you get the decimal part of a real number, for example, AMOD(3.456, 1.0) = 0.456.

I suspect the Excel developers saw the mod 1.0 in the Fortran code and thought, “Hmm, modulo only works with integers. The research article uses 1.0. Must be some weird Fortran thing.” So maybe they used some other way to get the decimal part of the sum and botched it up.

This entry was posted in Machine Learning. Bookmark the permalink.