Simpson’s Paradox

One of the most interesting mathematical curiosities I know of is called Simpson’s Paradox. Briefly, the same data can lead to two opposite conclusions depending on whether it is combined into a single table or broken down into groups.

Suppose you are looking at job application and hiring rates at some company, broken down by sex, so you can decide whether the company is discriminating against women, as of course they probably are. The company has two departments, Administration and Production.

The hiring data for the entire company is:

          Hired / Applicants
Male        205/650 = 0.32
Female      150/500 = 0.30

So 32% of males who applied were hired, and only 30% of females who applied were hired. The government looks at the data, sues the company, and executives immediately order all managers to take diversity and gender training, and to increase the hiring of women.

But the exact same data, broken down by department, is:

          Hired / Applicants, by department
          Administration      Production
Male        5/50  = 0.10    200/600 = 0.33
Female    100/400 = 0.25     50/100 = 0.50

So, in Administration, the percentage of women who are hired is more than twice that of men. And in Production, the percentage of women hired is also much greater than that of men (50% to 33%).

In short, the combined company-wide data shows men are hired at a greater rate than women, but the same data broken down by department shows that women are hired at a greater rate than men in every department!
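If you want to check the arithmetic yourself, here is a minimal Python sketch that rebuilds both tables from the same four pairs of counts. The names counts, department_rates and overall_rates are my own, made up for illustration; only the numbers come from the tables above.

```python
# (hired, applicants) for each sex and department, copied from the table above.
counts = {
    "Male":   {"Administration": (5, 50),    "Production": (200, 600)},
    "Female": {"Administration": (100, 400), "Production": (50, 100)},
}


def department_rates(counts):
    """Hire rate for each sex within each department."""
    return {
        sex: {dept: hired / applicants for dept, (hired, applicants) in depts.items()}
        for sex, depts in counts.items()
    }


def overall_rates(counts):
    """Company-wide hire rate for each sex: total hired / total applicants."""
    rates = {}
    for sex, depts in counts.items():
        total_hired = sum(hired for hired, _ in depts.values())
        total_applicants = sum(applicants for _, applicants in depts.values())
        rates[sex] = total_hired / total_applicants
    return rates


by_dept = department_rates(counts)
overall = overall_rates(counts)

for sex in counts:
    parts = ", ".join(f"{dept} {by_dept[sex][dept]:.2f}" for dept in counts[sex])
    print(f"{sex:<6} overall {overall[sex]:.2f} | {parts}")
```

Both tables come out of the same dictionary, so nothing is being hidden or double counted: the only difference is whether the counts are summed before or after dividing.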

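(For this example, the department-level data is the more accurate profile of the company: men are in fact at a hiring disadvantage. The combined figures point the other way only because most men applied to Production, where hiring rates are high for both sexes, while most women applied to Administration, where they are low for both.)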
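One way to see where the reversal comes from is to write each company-wide rate as a weighted average of the department rates, with the weights being the share of that sex's applicants who applied to each department. The sketch below does that; the function name weighted_rate is hypothetical, and the counts are again taken straight from the department table.

```python
# (hired, applicants) pairs copied from the department table.
male = {"Administration": (5, 50), "Production": (200, 600)}
female = {"Administration": (100, 400), "Production": (50, 100)}


def weighted_rate(depts):
    """Return (overall rate, {dept: (applicant share, dept hire rate)}).

    The overall rate is the sum over departments of share * rate.
    """
    total_applicants = sum(applicants for _, applicants in depts.values())
    terms = {
        dept: (applicants / total_applicants, hired / applicants)
        for dept, (hired, applicants) in depts.items()
    }
    overall = sum(share * rate for share, rate in terms.values())
    return overall, terms


for label, depts in (("Male", male), ("Female", female)):
    overall, terms = weighted_rate(depts)
    pieces = " + ".join(f"{share:.2f} x {rate:.2f}" for share, rate in terms.values())
    print(f"{label:<6} {pieces} = {overall:.2f}")
```

Men put about 92% of their weight on the department with the higher hire rate, while women put 80% of theirs on the department with the lower one, and that difference in weights is enough to flip the combined comparison.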
