## The Chi-Square Test in Software Testing

Several of my blog entries here have described mathematical techniques that are really useful in software testing. One of the most useful is the chi-square test for goodness of fit. This test applies to many situations, but is not often used in practice. I suspect this is because most of the testers I know do not understand the chi-square test and therefore do not recognize the situations where it is useful. The chi-square test can be used to determine how well a set of actual results matches a corresponding set of expected results.

Here’s an example. Suppose you have some software system that is supposed to spit out a total of 100 letters, each picked at random from the 5 letters ‘A’ through ‘E’. You’d expect to get about 20 of each letter, subject to a certain amount of random variation. Suppose you run the system and get this actual data:

A = 12, B = 10, C = 20, D = 30, E = 28

The chi-square statistic is the sum, over all categories, of the squared difference between the observed and expected counts divided by the expected count. In this case the chi-square statistic is

$$\chi^2 = \frac{(12-20)^2}{20} + \frac{(10-20)^2}{20} + \frac{(20-20)^2}{20} + \frac{(30-20)^2}{20} + \frac{(28-20)^2}{20} = 3.2 + 5.0 + 0.0 + 5.0 + 3.2 = 16.4$$

The number of degrees of freedom for chi-square is just k-1, the number of categories minus one; in this case df = k-1 = 5-1 = 4. Now we can look up the 95% critical value in any stats book and find it is 9.49. Because our calculated chi-square value of 16.4 is greater than the critical value of 9.49, we conclude that the software system is not performing as it should: there is less than a 5% chance we’d see a discrepancy this large if the system were actually spitting out evenly distributed letters. (Excel can do chi-square, see the image below. The .000324 shown there is a p-value: the probability of a discrepancy at least this large arising by chance if the letters really are evenly distributed, not the probability that the observed numbers match the expected numbers.)

There’s a lot more to chi-square, but once you recognize the situations in which it can be used, you’ll be surprised at just how useful chi-square is in software testing.
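The arithmetic above is easy to script, which helps if you want to run the check as part of a test suite. What follows is only a minimal sketch in Python using the standard library; the function names `chi_square_statistic` and `p_value_even_df` are made up for this illustration and do not come from any library. The p-value helper relies on a closed form that holds only when the degrees of freedom are even (as with df = 4 here); for the general case you would probably reach for a statistics package instead. For 16.4 with 4 degrees of freedom the formula gives roughly 0.0025, which is worth comparing against whatever your spreadsheet reports.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_even_df(statistic, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = statistic / 2.0
    # P(X >= x) = exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    series = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * series


# Counts from the example: 100 letters across 5 categories.
observed = {"A": 12, "B": 10, "C": 20, "D": 30, "E": 28}
counts = list(observed.values())

# Under the "evenly distributed" hypothesis every letter is expected 20 times.
expected_each = sum(counts) / len(counts)
expected = [expected_each] * len(counts)

statistic = chi_square_statistic(counts, expected)
df = len(counts) - 1

# 95% critical value for df = 4, taken from the text (a stats-table lookup).
CRITICAL_95_DF4 = 9.49
reject = statistic > CRITICAL_95_DF4

p_value = p_value_even_df(statistic, df)

print(f"chi-square = {statistic:.1f}, df = {df}")
print(f"reject 'evenly distributed' at the 5% level: {reject}")
print(f"p-value = {p_value:.6f}")
```

Hard-coding the critical value keeps the sketch dependency-free, but it also ties the code to df = 4. If the number of categories changes, the constant has to be looked up again.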