A few days ago, a work colleague, Clifford D., pointed me to an excellent video he made about the Beta-Binomial distribution. (Unfortunately, as far as I know, the video is not publicly available.)

The regular Binomial distribution can answer questions like, “Assuming a baseball player’s batting average is 0.300, what is the probability that he will get 4 hits in the next 5 at-bats? And then after that, what is the probability he will get 3 hits in the next 6 at-bats?” Both calculations use the same fixed value of p = 0.300, which doesn’t change.

The Beta-Binomial distribution can answer questions like, “Assuming a baseball player currently has 3 hits in 10 at-bats, what is the probability he will get 4 hits in the next 5 at-bats? And then after that, what is the probability he will get 3 hits in the next 6 at-bats?” In this case the first calculation uses a = 3 successes and b = 7 failures. But the second calculation uses the updated numbers of successes and failures from all 15 previous at-bats (for example, a = 7 and b = 8 if he did get 4 hits in those 5 at-bats).

*Beta-Binomial implemented using C#*

Put another way, the Beta-Binomial distribution can answer the general question, “Given that there have been **a** successes and **b** failures at a particular point in time, what is the probability of getting k successes in the next n trials?”

Just for fun, I decided to implement the Beta-Binomial equation using C#. The equation for the regular Binomial distribution is:
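Written out in standard notation, where the first factor is the binomial coefficient "n choose k":

$$
P(X = k) = \binom{n}{k} \, p^{k} \, (1 - p)^{n - k}, \qquad \binom{n}{k} = \frac{n!}{k! \, (n - k)!}
$$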

In words, this is, “Given that the probability of a success on a single trial is p, the probability of getting exactly k successes in n trials.”
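To make the Binomial case concrete, here is a minimal C# sketch that evaluates the two baseball questions with p = 0.300. The class and method names (`BinomialDemo`, `Choose`, `BinomialPmf`) are just illustrative names chosen for this sketch.

```csharp
using System;

public static class BinomialDemo
{
    // Binomial coefficient "n choose k", built up multiplicatively
    // so that no large factorials are ever formed.
    public static double Choose(int n, int k)
    {
        if (k < 0 || k > n) return 0.0;
        k = Math.Min(k, n - k);  // use symmetry to shorten the loop
        double result = 1.0;
        for (int i = 1; i <= k; ++i)
            result = result * (n - k + i) / i;
        return result;
    }

    // Probability of exactly k successes in n trials when each
    // trial succeeds with fixed probability p.
    public static double BinomialPmf(int k, int n, double p)
    {
        return Choose(n, k) * Math.Pow(p, k) * Math.Pow(1.0 - p, n - k);
    }

    public static void Main()
    {
        // 4 hits in the next 5 at-bats: about 0.02835
        Console.WriteLine(BinomialPmf(4, 5, 0.300).ToString("F5"));
        // 3 hits in the next 6 at-bats: about 0.18522
        Console.WriteLine(BinomialPmf(3, 6, 0.300).ToString("F5"));
    }
}
```

Note that p stays at 0.300 in both calls, no matter what happened in the first 5 at-bats.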

The Wikipedia entry on the Beta-Binomial distribution gives several versions of the equation for the distribution. Here are three variations:
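Using a and b for the prior successes and failures (Wikipedia writes α and β), and assuming the standard forms of the distribution, the three variations can be written as follows. The third is shown here as the log of the second:

$$
f(k \mid n, a, b) = \binom{n}{k} \frac{B(k + a, \; n - k + b)}{B(a, b)}
$$

$$
f(k \mid n, a, b) = \frac{\Gamma(n + 1)}{\Gamma(k + 1) \, \Gamma(n - k + 1)} \cdot \frac{\Gamma(k + a) \, \Gamma(n - k + b)}{\Gamma(n + a + b)} \cdot \frac{\Gamma(a + b)}{\Gamma(a) \, \Gamma(b)}
$$

$$
\begin{aligned}
\ln f(k \mid n, a, b) ={} & \ln\Gamma(n + 1) - \ln\Gamma(k + 1) - \ln\Gamma(n - k + 1) \\
& + \ln\Gamma(k + a) + \ln\Gamma(n - k + b) - \ln\Gamma(n + a + b) \\
& + \ln\Gamma(a + b) - \ln\Gamma(a) - \ln\Gamma(b)
\end{aligned}
$$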

The top equation is expressed in terms of the Beta function B and is more or less the defining equation, but it’s very difficult to compute directly. The second equation is based on deep mathematical relationships between the Beta function and the Gamma function. The Gamma function is also difficult to compute, but the log-Gamma function is manageable. So, by applying the log() function to the top and bottom of the second equation for Beta-Binomial, and using the facts that log(x * y) = log(x) + log(y) and log(x / y) = log(x) - log(y), you get the third equation. Applying exp() to the result then recovers the probability itself.

Although the third equation is long, it’s not difficult assuming you can compute log-Gamma. Luckily, I had implemented the LogGamma() function in C# (see https://jamesmccaffrey.wordpress.com/2013/06/19/the-log-gamma-function-with-c/) and so coding a Beta-Binomial demo was quite simple. In my demo program, I computed that the probability of 5 successes in the next 10 trials given that there were 4 successes and 3 failures in the previous 7 trials is 0.1469.
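Here is a minimal sketch of how the log-Gamma form might be coded in C#. It is not the demo program itself: the names `BetaBinomialDemo` and `BetaBinomialPmf` are made up for this sketch, and `LogGamma` here uses a Lanczos approximation (g = 7, nine coefficients), which is an assumption and may differ from the implementation in the linked post.

```csharp
using System;

public static class BetaBinomialDemo
{
    // Lanczos coefficients for g = 7 (an assumed choice for this sketch).
    static readonly double[] Coef = {
        0.99999999999980993, 676.5203681218851, -1259.1392167224028,
        771.32342877765313, -176.61502916214059, 12.507343278686905,
        -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7
    };

    // Natural log of Gamma(x) for x > 0.
    public static double LogGamma(double x)
    {
        if (x <= 0.0)
            throw new ArgumentOutOfRangeException(nameof(x), "x must be positive");
        if (x < 0.5)  // reflection formula for small arguments
            return Math.Log(Math.PI / Math.Sin(Math.PI * x)) - LogGamma(1.0 - x);
        x -= 1.0;
        double sum = Coef[0];
        for (int i = 1; i < Coef.Length; ++i)
            sum += Coef[i] / (x + i);
        double t = x + 7.5;  // x + g + 0.5
        return 0.5 * Math.Log(2.0 * Math.PI) + (x + 0.5) * Math.Log(t) - t + Math.Log(sum);
    }

    // Probability of k successes in the next n trials, given a prior
    // successes and b prior failures: sum the log-Gamma terms, then exp().
    public static double BetaBinomialPmf(int k, int n, double a, double b)
    {
        double logP =
            LogGamma(n + 1) - LogGamma(k + 1) - LogGamma(n - k + 1)
            + LogGamma(k + a) + LogGamma(n - k + b) - LogGamma(n + a + b)
            + LogGamma(a + b) - LogGamma(a) - LogGamma(b);
        return Math.Exp(logP);
    }

    public static void Main()
    {
        // 5 successes in the next 10 trials after 4 successes, 3 failures: about 0.1469
        Console.WriteLine(BetaBinomialPmf(5, 10, 4, 3).ToString("F4"));
        // Baseball example: 4 hits in the next 5 at-bats after 3 hits in 10: about 0.0524
        Console.WriteLine(BetaBinomialPmf(4, 5, 3, 7).ToString("F4"));
    }
}
```

Working in log space keeps every intermediate value small, whereas computing the Gamma terms directly would overflow a double once the arguments get past roughly 170.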

Neat!

*Beta particle decay – cool. Betamax video – defunct. Beta male doing housework – sad. Lancia Beta – beautiful but unreliable.*
