## Estimating Binomial Probabilities using the Normal Approximation and C#

Suppose you flip a coin 10 times. What is the probability you’ll get exactly 2 heads? This is an example of a binomial probability problem. In math terms n is the number of trials (10 for the example), k is the number of “successes” (2), and p is the probability of getting a success in a single trial. Because heads and tails are equally likely, p = 0.5 in this example.

From Wikipedia, the equation for calculating Pr(k | n,p) is:

The trickiest part of computing binomial probabilities exactly is calculating the Choose(n,k) function because Choose() can return astronomically large values for even moderate size of n and k. For example, Choose(500, 100) =

204,169,423,593,847,671,561,387,240,724,193,094,030,165,
460,090,325,932,474,143,661,505,758,330,944,415,515,994,
945,349,100,113,181,687,417,345

One solution is to use a BigInteger class that can handle arbitrarily large integers. I coded up this approach for binomial probabilities in an earlier blog post.

However, there is a traditional alternative that was the main option in the days before computers — use the Normal distribution to approximate the binomial distribution.

The basic idea is easy but there are a ton of interrelated details. 1.) There are ways to calculate the approximate area under the standard Normal (aka Gaussian) curve from -infinity to some value z.

2.) The binomial distribution for k successes in n trials with p the probability of a success in a single trial, is fairly close to the Normal distribution where z = (k – mean) / sd, where mean = n * p and sd = sqrt(n * p * (1-p)).

3.) To account for the way the actual binomial distribution, which is rectangles, goes past the curve of the Normal distribution, you add 1/2 to k. This is called “continuity correction”.

I’ve left out pages and pages and pages of additional details.

Anyway, just for fun I coded up an implementation that approximates the binomial distribution using the Normal distribution, with C#. My approximation gives identical results to the R language approximation.

Very interesting little exercise.