I was giving a lecture at the tech company I work for, and one of the attendees asked about the probability density function (PDF) for a Gaussian (aka Normal, bell-shaped) distribution. Briefly, the area under the PDF between two x values is the probability that a randomly generated x will fall between those two values. For example, for a Gaussian with mean = 0 and standard deviation = 1, the probability that a randomly generated x is between 0.0 and 1.0 is the area under the curve between 0.0 and 1.0, which is approximately 0.3413.
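To make the "area under the curve" idea concrete, here is a minimal sketch that approximates the area between 0.0 and 1.0 for N(0,1) with the trapezoid rule. The helper names gaussian_pdf() and area_under_pdf() and the step count n are my own choices for illustration, not part of any library.

```python
import math

def gaussian_pdf(x, mean=0.0, sd=1.0):
    # Gaussian density: exp(-z^2 / 2) / (sd * sqrt(2 * pi)), z = (x - mean) / sd
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def area_under_pdf(lo, hi, mean=0.0, sd=1.0, n=10_000):
    # Trapezoid rule: endpoints get half weight, interior points full weight
    h = (hi - lo) / n
    total = 0.5 * (gaussian_pdf(lo, mean, sd) + gaussian_pdf(hi, mean, sd))
    for i in range(1, n):
        total += gaussian_pdf(lo + i * h, mean, sd)
    return total * h

area = area_under_pdf(0.0, 1.0)
print("P(0.0 <= x <= 1.0) = %0.4f" % area)  # 0.3413
```

In practice you would not integrate by hand, but the sketch shows that the probability really is just an area.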
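For that same Gaussian, the PDF value at x = 1.0 is approximately 0.2420. A PDF value can be used to compare the relative likelihoods of two different x values. For example, the PDF at x = 2.0 is about 0.0540, so getting a value near x = 1.0 is more likely than getting a value near x = 2.0. PDF values are not probabilities: for a continuous distribution, the probability of getting any single exact x is zero.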
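The comparison can be checked with a few lines of code. This sketch uses NormalDist from the Python standard library statistics module (my substitution, so that it runs without third-party packages); the variable names are just for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

pdf_1 = std_normal.pdf(1.0)
pdf_2 = std_normal.pdf(2.0)

print("pdf(1.0) = %0.4f" % pdf_1)           # 0.2420
print("pdf(2.0) = %0.4f" % pdf_2)           # 0.0540
# The ratio says how much more likely values near 1.0 are than values near 2.0
print("ratio    = %0.2f" % (pdf_1 / pdf_2)) # 4.48
```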
The total area under a Gaussian distribution is 1.0 but a PDF value can be greater than 1.0 if the distribution is squished, meaning it has a very small standard deviation.
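Here is a small sketch of the squished case, assuming a standard deviation of 0.1 (a value I picked for illustration). It again uses the standard library NormalDist rather than scipy.

```python
from statistics import NormalDist

# A narrow Gaussian: mean 0, standard deviation 0.1 (illustrative value)
narrow = NormalDist(mu=0.0, sigma=0.1)

# Peak height is 1 / (sd * sqrt(2 * pi)), which exceeds 1.0 when sd is small
peak = narrow.pdf(0.0)
print("pdf(0.0) = %0.4f" % peak)        # 3.9894

# The area is still 1.0; the range -1 to +1 covers 10 standard deviations each way
total_area = narrow.cdf(1.0) - narrow.cdf(-1.0)
print("area     = %0.4f" % total_area)  # 1.0000
```

The curve is tall but very thin, so a height near 4.0 is consistent with a total area of 1.0.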
In machine learning, probably the most common task related to probability distributions is to generate x values from a Gaussian distribution. Computing a PDF value is less common and can be easily done using a program-defined function or the scipy norm.pdf() function. To compute the area under the curve between two values (that is, the probability x is between two values), you can use the scipy norm.cdf() function (cumulative distribution function): the area between a and b is cdf(b) minus cdf(a).
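As a rough sketch of the CDF approach, the code below computes the probability that x is between 0.0 and 1.0 and then draws a few values. To keep it free of third-party packages I use NormalDist from the standard library statistics module as a stand-in; with scipy the equivalent would be norm.cdf(1.0) - norm.cdf(0.0). The seed value is arbitrary.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# P(0.0 <= x <= 1.0) is the difference of two CDF values
prob = std_normal.cdf(1.0) - std_normal.cdf(0.0)
print("P(0.0 <= x <= 1.0) = %0.4f" % prob)  # 0.3413

# Draw five values from N(0,1); the seed is fixed only for repeatability
samples = std_normal.samples(5, seed=1)
for x in samples:
    print("x = %8.4f" % x)
```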
The Gaussian distribution is also known as the Normal distribution because, well, it’s mathematically normal. Two un-normal math photos. Left: Teaching students about angles at a U.S. high school. Explains a lot. Right: The concept of infinity that’s not so infinite.
# gaussian_pdf_demo.py

import numpy as np
from scipy.stats import norm

def my_pdf(x, u, sd):
  # Gaussian PDF: exp(-((x - u) / sd)^2 / 2) / (sd * sqrt(2 * pi))
  a = np.exp(-((x - u) / sd)**2 / 2)
  b = sd * np.sqrt(2 * np.pi)
  return a / b

print("\nBegin Gaussian pdf() demo ")
np.random.seed(1)  # fixed seed so the samples are repeatable

print("\nSampling 5 values from N(0,1) ")
for i in range(5):
  x = np.random.normal(loc=0.0, scale=1.0)
  print("x = %8.4f " % x)

print("\nComputing pdf() for x = 1.0 ")
# library version
y = norm.pdf(x=1.0, loc=0.0, scale=1.0)
print("%8.4f " % y)

# program-defined version, should match
y = my_pdf(x=1.0, u=0.0, sd=1.0)
print("%8.4f " % y)

print("\nEnd demo ")