The beta distribution pops up from time to time in my work with machine learning. Many people are familiar with the Gaussian (also called normal, or bell-shaped) distribution. A particular Gaussian distribution is characterized by a mean and a standard deviation which determine the behavior of the distribution. Each set of (mean, sd) values determine a different Gaussian distribution. For example, if mean = 0.0 and sd = 1.0 then if you draw many sample values (usually called z) from the distribution, you’d expect about 68% of the z values to be between -1.0 and +1.0 and about 95% of the z values to be between -2.0 and +2.0, and so on.
The beta distribution also has two characteristic values, usually called alpha and beta, or more succinctly, just a and b. Each set of (a,b) pairs determine a different beta distribution. When you sample from beta(a,b) each sample value (I usually call them p values) will be between 0.0 and 1.0 and if you sample many values they will average to a / (a+b). For example, if you sample many values from beta(3, 1), each value will be between 0.0 and 1.0 and all the values will average to about 3/4 = 0.75.
The NumPy add-on package for the Python language has a built-in beta() function. For example:
import numpy as np np.random.seed(0) for i in range(1000): p = np.random.beta(3,1) print(p)
will generate 1,000 p-values between 0.0 and 1.0 that average to about 0.75. One of my character flaws is that I’m never completely happy using functions from a code library unless I completely understand the function. And that means I want to be able to implement the function from scratch.
After a bit of research, I found a 1978 research paper titled “Generating Beta Variates with Nonintegral Shape Parameters” by R. C. H. Cheng. The paper provided a basic (meaning somewhat inefficient for 1970s era computers) algorithm. So, I coded up the algorithm using raw Python. I generated 10,000 samples from beta(3,1) and compared the results to the beta() function in the NumPy library and got the same results. I was happy about that.
You must be logged in to post a comment.