One of the most interesting mathematical curiosities I know of is called Simpson’s Paradox. Briefly, the same data in a table can lead to two opposite conclusions depending on whether you look at the combined totals or at the subgroups that make them up.
Suppose you are looking at job application and job offer rates at some company, broken down by sex, so you can decide if the company is discriminating against women, as of course they probably are. The company has two departments, Administration and Production.
The hiring data for the entire company is:
          Hired / Applicants
Male      205/650 = 0.32
Female    150/500 = 0.30
So 32% of males who applied were hired, and only 30% of females who applied were hired. The government looks at the data, sues the company, and executives immediately order all managers to take diversity and gender training, and to increase the hiring of women.
But the exact same data, broken down by department, is:
          Administration      Production
Male      5/50 = 0.10         200/600 = 0.33
Female    100/400 = 0.25      50/100 = 0.50
So, in Administration, the percentage of women who are hired is more than twice that of men. And in Production, the percentage of women hired is also much greater than that of men (50% to 33%).
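If you want to check the arithmetic yourself, here is a minimal Python sketch that recomputes every rate from the raw counts in the two tables. The names `counts`, `rate`, and `pooled_rate` are my own choices for illustration; only the numbers come from the example.

```python
from fractions import Fraction

# (hired, applicants) pairs copied from the department table
counts = {
    "Male":   {"Administration": (5, 50),    "Production": (200, 600)},
    "Female": {"Administration": (100, 400), "Production": (50, 100)},
}

def rate(hired, applicants):
    """Hiring rate for a single department, kept exact as a fraction."""
    return Fraction(hired, applicants)

def pooled_rate(by_department):
    """Company-wide rate: total hired divided by total applicants."""
    hired = sum(h for h, _ in by_department.values())
    applicants = sum(a for _, a in by_department.values())
    return Fraction(hired, applicants)

for sex, by_department in counts.items():
    for department, (hired, applicants) in by_department.items():
        print(f"{sex:6} {department:14} {float(rate(hired, applicants)):.2f}")
    print(f"{sex:6} {'Whole company':14} {float(pooled_rate(by_department)):.2f}")
```

Summing the department counts reproduces the company-wide figures (205/650 for men, 150/500 for women), so both tables really do describe the same applicants.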
In short, the combined company-wide data shows men are hired at a greater rate than women, but the same data broken down by department shows that women are hired at a greater rate than men in every department!
(For this example, the department-level data is the more accurate profile of the company: men are in fact at a hiring disadvantage.)
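The reversal is easier to see if each company-wide rate is written as a weighted average of the department rates, where the weights are the share of that sex's applicants who applied to each department. This is only a rearrangement of the numbers in the tables, with the weights rounded to two decimals:

```latex
\[
\text{Male: } \frac{205}{650}
  = \frac{50}{650}\cdot\frac{5}{50} + \frac{600}{650}\cdot\frac{200}{600}
  \approx 0.08\,(0.10) + 0.92\,(0.33) \approx 0.32
\]
\[
\text{Female: } \frac{150}{500}
  = \frac{400}{500}\cdot\frac{100}{400} + \frac{100}{500}\cdot\frac{50}{100}
  = 0.80\,(0.25) + 0.20\,(0.50) = 0.30
\]
```

About 92% of the men applied to Production, where hiring rates are high for everyone, while 80% of the women applied to Administration, where hiring rates are low for everyone. That difference in where people applied is what pulls the men's company-wide rate above the women's, even though women do better within each department.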