Yet Another Ignorant Misuse of Statistics

Repeat after me: correlation does not mean causation. A recent headline from the Chicago Tribune caught my eye: “Chicago Area Pays Steep Price for Segregation, Study Finds.”

Briefly, the article points out that Chicago is highly segregated and that less segregated cities have lower murder rates, higher productivity, and higher incomes for African Americans. It then concludes that forcing desegregation in Chicago would therefore lower murder rates, increase productivity, and increase the incomes of African Americans. Huh?!

The writer obviously didn’t take a basic statistics class in her college days. Just because lower rates of segregation correlate with lower murder rates in some cities doesn’t mean that changing the segregation characteristics of Chicago would have any effect at all. In fact, you could just as easily argue that desegregation could increase murder rates in Chicago. Correlation does not mean causation.

For example, from 1999 to 2009, U.S. crude oil imports from Norway correlated almost perfectly with the number of U.S. drivers killed in collisions with trains. Does this mean that we can reduce car-train fatalities by reducing oil imports from Norway?

The most obvious explanation for the Chicago data is that there are underlying latent variables (such as the high rate of African American children who are being raised by single parents) which cause all three outcomes: segregation, higher murder rates, and lower incomes.

source: U.S. Centers for Disease Control and Prevention
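To make the latent-variable point concrete, here is a minimal sketch on purely synthetic data; it has nothing to do with the study's actual numbers. The names `z`, `a`, and `b` are made up for illustration: `z` is a hidden factor, and `a` and `b` each depend on `z` plus independent noise. Neither one influences the other, yet they end up clearly correlated, and the correlation all but vanishes once the effect of `z` is regressed out of both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares line fit on xs is removed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=5000, seed=0):
    """Synthetic data: hidden z drives both a and b; a never touches b."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]    # hypothetical latent variable
    a = [zi + rng.gauss(0, 1) for zi in z]     # a = z + independent noise
    b = [zi + rng.gauss(0, 1) for zi in z]     # b = z + independent noise
    return z, a, b


if __name__ == "__main__":
    z, a, b = simulate()
    # In theory the raw correlation is 0.5 for this setup.
    print("corr(a, b):", round(pearson(a, b), 3))
    # With z accounted for, the leftover correlation should be near zero.
    print("corr(a, b | z):",
          round(pearson(residuals(a, z), residuals(b, z)), 3))
```

In this toy setup, forcing `a` up or down would do nothing to `b`, because the only link between them runs through `z`. That is exactly the trap in reading a policy recommendation off a correlation.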

Oh, by the way, the study which produced these results cost the city of Chicago $500,000. So I see that high-priced government studies and poor newspaper writing correlate with high murder rates. So murder rates in Chicago can be reduced by eliminating stupid studies? No.
