Chi-Squared Goodness of Fit using C#

I wrote an article titled “Chi-Squared Goodness of Fit using C#” in the March 2017 issue of Microsoft MSDN Magazine. See https://msdn.microsoft.com/en-us/magazine/mt795190.

A chi-squared goodness of fit test is most often used when you have a dataset of observed “count” values and you want to see if those counts are consistent with some hypothesized (expected) counts. For example, in my article, I describe a problem where you are looking at an American-style roulette wheel that you suspect might be biased.

[Image: chi-squared demo run]

A roulette wheel has 38 slots: 18 red, 18 black, and 2 green. If the wheel is fair and you spun it 1,000 times, you’d expect to get approximately (474, 474, 52): 18 / 38 * 1000 ≈ 473.7 red results, 18 / 38 * 1000 ≈ 473.7 black results, and 2 / 38 * 1000 ≈ 52.6 green results, with the whole-number counts rounded so that they sum to 1,000.
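The expected-count arithmetic above can be sketched in a few lines of C#. This is a minimal illustration only: the class and method names (ExpectedCountsDemo, ExpectedCounts) are hypothetical and are not taken from the article's code.

```csharp
using System;

public static class ExpectedCountsDemo
{
    // Expected count per category =
    //   (slots in that category / total slots) * number of spins.
    public static double[] ExpectedCounts(int[] slots, int spins)
    {
        int totalSlots = 0;
        foreach (int s in slots)
            totalSlots += s;

        double[] expected = new double[slots.Length];
        for (int i = 0; i < slots.Length; ++i)
            expected[i] = (double)slots[i] / totalSlots * spins;
        return expected;
    }

    public static void Main()
    {
        int[] slots = { 18, 18, 2 };  // red, black, green
        double[] expected = ExpectedCounts(slots, 1000);
        foreach (double e in expected)
            Console.WriteLine(e.ToString("F2"));
    }
}
```

The unrounded values are about 473.68, 473.68, and 52.63. Keeping them as doubles, rather than rounding to whole counts, avoids a small rounding error in any later calculation.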

But suppose you actually perform an experiment and instead of getting (474, 474, 52) you get (450, 460, 90): fewer reds and fewer blacks than expected, and more greens. Even if the wheel is fair, this could have happened by sheer luck. A chi-squared test gives you the probability of seeing results at least this far from the expected counts if the wheel really is fair, and from this you can infer whether the wheel is likely biased.
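The number that measures how far the observed counts are from the expected counts is the chi-squared statistic: the sum, over all categories, of (observed - expected)^2 / expected. Here is a minimal sketch of that calculation for the roulette example. The names ChiSquaredStatDemo and ChiSquaredStat are hypothetical, chosen for illustration, and the expected counts are kept unrounded.

```csharp
using System;

public static class ChiSquaredStatDemo
{
    // Chi-squared statistic:
    //   sum over categories of (observed - expected)^2 / expected.
    public static double ChiSquaredStat(int[] observed, double[] expected)
    {
        if (observed.Length != expected.Length)
            throw new ArgumentException("observed and expected must have the same length");

        double sum = 0.0;
        for (int i = 0; i < observed.Length; ++i)
        {
            double diff = observed[i] - expected[i];
            sum += diff * diff / expected[i];
        }
        return sum;
    }

    public static void Main()
    {
        int[] observed = { 450, 460, 90 };  // red, black, green
        double[] expected =
            { 18.0 / 38 * 1000, 18.0 / 38 * 1000, 2.0 / 38 * 1000 };

        double chi = ChiSquaredStat(observed, expected);
        int df = observed.Length - 1;  // degrees of freedom = categories - 1
        Console.WriteLine("chi-squared = " + chi.ToString("F2") + ", df = " + df);
    }
}
```

For these counts the statistic works out to roughly 28.11 with 2 degrees of freedom. Almost all of it comes from the green category, where 90 results were observed against about 52.6 expected.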

As I point out in my article, there are plenty of tools that can do a chi-squared test, including R and Excel. However, if you want to implement chi-squared directly in a software system you might want to write the code yourself. The key to implementing chi-squared is calculating an area under a chi-squared distribution. I show how to use a special algorithm named ACM 299 (which in turn uses ACM 209).
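To give a feel for what "area under a chi-squared distribution" means in code, here is a minimal sketch that computes the right-tail area (the p-value) for a given statistic and degrees of freedom. Note that this is not ACM 299. It swaps in a different, simpler technique: the standard power series for the regularized lower incomplete gamma function, subtracted from 1. The names ChiSquaredTailDemo, UpperTail, and LogGammaHalfDfPlusOne are hypothetical, and the sketch assumes integer degrees of freedom and moderate statistic values. For a very large statistic the subtraction from 1 loses precision and the series sum can overflow.

```csharp
using System;

public static class ChiSquaredTailDemo
{
    // log(Gamma(df/2 + 1)), built up from Gamma(1) = 1 (even df) or
    // Gamma(1/2) = sqrt(pi) (odd df) using Gamma(z + 1) = z * Gamma(z).
    static double LogGammaHalfDfPlusOne(int df)
    {
        bool even = (df % 2 == 0);
        double z = even ? 1.0 : 0.5;
        double logG = even ? 0.0 : 0.5 * Math.Log(Math.PI);
        double target = df / 2.0 + 1.0;
        while (z < target - 1e-9)
        {
            logG += Math.Log(z);
            z += 1.0;
        }
        return logG;
    }

    // Right-tail area of the chi-squared distribution with df degrees of freedom.
    // Uses P(a, x) = x^a * e^-x / Gamma(a + 1) * sum_{n>=0} x^n / ((a+1)...(a+n))
    // with a = df / 2 and x = chiSquared / 2, and returns 1 - P(a, x).
    public static double UpperTail(double chiSquared, int df)
    {
        if (df < 1) throw new ArgumentOutOfRangeException(nameof(df));
        if (chiSquared <= 0.0) return 1.0;

        double a = df / 2.0;
        double x = chiSquared / 2.0;

        double term = 1.0;  // n = 0 term of the series
        double sum = 1.0;
        for (int n = 1; n < 10000; ++n)
        {
            term *= x / (a + n);
            sum += term;
            if (term < sum * 1e-16) break;  // remaining terms are negligible
        }

        double lower = Math.Exp(a * Math.Log(x) - x - LogGammaHalfDfPlusOne(df)) * sum;
        return Math.Max(0.0, 1.0 - lower);  // clamp tiny negative rounding error
    }

    public static void Main()
    {
        // Roughly 28.11 is the statistic for observed (450, 460, 90)
        // against the fair-wheel expected counts, with df = 3 - 1 = 2.
        double p = UpperTail(28.11, 2);
        Console.WriteLine("p-value = " + p.ToString("E2"));
    }
}
```

A handy check is that with 2 degrees of freedom the right-tail area has the closed form exp(-chiSquared / 2). For a statistic of about 28.11 that is on the order of 0.000001, so results this extreme would be very unlikely if the wheel were fair.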

[Image: chi-squared distribution]

