Machine Learning, Data Science, and Statistics

There are no universally agreed-upon definitions for the terms “machine learning”, “data science”, and “statistics”. In my mind, classical statistics consists of traditional techniques that were developed from the 1920s through the 1970s. Statistical techniques include things like correlation, linear regression, and the t-test for hypothesis testing.
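To make two of those classical techniques concrete, here is a minimal sketch in plain Python of the Pearson correlation coefficient and a simple least-squares line fit. The function names (pearson_r, fit_line) and the tiny hours/scores dataset are made up for illustration only; in practice you would likely reach for a statistics library instead.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def fit_line(xs, ys):
    """Simple linear regression y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied vs. test score (illustration only)
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

r = pearson_r(hours, scores)
slope, intercept = fit_line(hours, scores)
print(f"r = {r:.4f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Both calculations are simple enough to work out by hand on small data, which fits my notion of classical statistics as techniques that predate cheap computing.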

In my mind, machine learning consists of techniques that make predictions based on data and usually require computer analysis. Examples include logistic regression classification, neural network classification, and k-means clustering.


In my mind, data science is a general term that includes classical statistics, machine learning, and other topics such as database theory and practice.

And in my mind, artificial intelligence is a term that refers to systems that loosely mimic human behavior. Topics include speech recognition and pattern (visual) recognition.

I recently sat in on an interesting talk at the SAS Analytics conference. The talk was a general overview of machine learning and was given by Brett Wujek of SAS. The talk had a PowerPoint slide that attempted to illustrate the relationships between various terms.

On the one hand, an exercise like this is somewhat futile because it’s very subjective. But on the other hand, it’s an interesting attempt to clarify relationships.
