A few weeks ago, I started working on a project that involves Differential Privacy (DP). DP is extremely complex, but like most complex topics, understanding DP is mostly a matter of understanding many sub-topics. DP uses the mathematical Laplace probability distribution. I hadn’t used the Laplace distribution since my university teaching days, and so one of my first steps was to look at Laplace.
Whenever I think about an exotic probability distribution, I mentally compare the exotic distribution to the Gaussian (aka Normal, aka bell-shaped) distribution. A Gaussian distribution has a mean (u) and a standard deviation (sd). When graphed, the mean specifies the middle point of the graph and the standard deviation specifies the spread — almost all of the area under a Gaussian will be between -6*sd and +6*sd.
The curve of the graph is called the probability density function (pdf). The x-axis extends from minus infinity to positive infinity. The total area under the density function curve is 1.0. The specific value for x on the curve isn’t a probability — you find the probability between two x values. For example, for a Gaussian with mean = 0.0 and standard deviation = 1.0, if you generate a random x value, x could be any value, but the probability that x is between 0.0 and 1.5 is the area under the Gaussian pdf between x = 0.0 and x = 1.5, which is approximately 0.4332 or 43%.
The Laplace distribution is somewhat similar to the Gaussian distribution. A Laplace distribution has a “location” value (similar to a mean, given symbol u) and a “scale” value (similar to a standard deviation, given symbol b). The x values can be between minus infinity and plus infinity. The area under a Laplace distribution probability density function sums to 1.0 and if you generate a random x value, you find the probability that x is between two values by computing the area under the pdf. The Laplace distribution has a sharp peak instead of a rounded peak you see in a Gaussian distribution.
In the image above, the graph on the left is the Laplace distribution with u = 0.0 and b = 1.0. If you generate a random x value from this distribution, the probability that x is between 0.0 and 1.5 is the area under the pdf between 0.0 and 1.5. Each square in the graph is 1.0 wide and 0.10 high, so the area of one square is 1.0 * 0.10 = 0.10. There are about three squares under the curve between x = 0.0 and x = 1.5 so the probability that a randomly generated x is between 0.0 and 1.5 is about 0.300 (the actual probability is 0.3033).
A tiny Python language program that generates 10 random values from a Laplace distribution with u = 0.0 and b = 1.0. Notice most of the values (7 of them) are between -2 and +2 but a few values (the other 3) are extreme.
The graph on the right is the Laplace distribution with u = 0.0 and b = 2.0. Notice it is more spread out than the graph for b = 1.0.
So, how is the Laplace distribution used in Differential Privacy? Briefly, in DP when you query a dataset, instead of getting an exact answer, you get the exact answer plus a random noise value where the noise is generated from a Laplace distribution. For example, suppose a dataset of 100 employees has 26 employees with low salary, 60 employees with medium salary, and 14 employees with high salary. Suppose someone queries the dataset and asks, “How many high-salary employees are there?” If the system returned the exact value of 14, an unscrupulous person could use that information (in an extremely complex way) to determine information about a specific person in the dataset. But if the system returns the exact value 14 plus a random Laplace x, such as 2.3, the result of 16.3 gives the querying person some useful information (even if not completely accurate) without the risk of revealing personal information. Note: DP is crazy complicated and this example has dozens of details that I’ve left out.
It is possible to add noise to the result of a numeric query using the Gaussian distribution instead of the Laplace distribution. However, because the Gaussian is more spread out than the Laplace, noisy query results using Gaussian will be less accurate than noisy results using Laplace. Adding Gaussian noise does have a minor theoretical advantages over Laplace; for query results that are multi-valued vectors, Gaussian noise allows the use of either L1 or L2 sensitivity, while Laplace noise allows only L1 sensitivity.
A shadow box approximates reality. A shadow box is about the size of a picture (for example, roughly 12 inches wide and 24 inches tall and a couple of inches deep) that holds a 3D model of some sort. I like shadow boxes but they don’t seem to be as common as they used to be years ago. Left: A shadow box of the Super Mario Brothers video game (1985). Right: The living room from the old “The Brady Bunch” TV show (1969-1974) with the iconic stairs and horse statue (apparently — I’ve never seen an entire episode of that show).