Fundamentals of T-Test using R

I wrote an article titled “Fundamentals of T-Test using R” in the February 2016 issue of Visual Studio Magazine.


R is a scripting language, plus an interactive environment, plus a large library of built-in functions. R is used mostly for data analysis. Interest in R has increased significantly. I suspect the increase in interest may be due to Microsoft’s recent acquisition of the Revolution R product and its rebranding as Microsoft R Server.

In my article I show how to use R to perform a t-test. When I wrote the article, the most difficult part was explaining exactly what the t-test is. I explained by using an example:

Imagine that you work for a very large school and want to investigate the difference in mathematical ability of the male students in a certain grade versus the female students. Because the math ability exam is time-consuming and expensive, you can’t give the exam to all of the students. So you randomly select a sample of the males and a sample of the females and administer the exam to the two groups.

The t-test calculates the mean score of each of the two samples and compares the two sample means to infer whether the means of the two parent populations (all male students and all female students) are probably the same.
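
To make the idea concrete, here is a minimal sketch in R. The scores are made up for illustration and are not the data from my article, and the variable names (males, females, t_manual) are my own. The sketch calls R's built-in t.test() function, which by default performs the Welch version of the test (it does not assume the two populations have equal variances), and then recomputes the t statistic by hand so you can see that it is just the difference of the two sample means divided by a standard error.

```r
# Hypothetical exam scores for two small random samples.
# These numbers are invented purely for illustration.
males   <- c(78, 85, 69, 91, 74, 88, 80, 72)
females <- c(82, 90, 77, 95, 86, 79, 91, 84)

# Two-sample t-test. With the default var.equal = FALSE this is
# the Welch t-test, which does not assume equal variances.
result <- t.test(males, females)

# Recompute the t statistic by hand:
# (difference of sample means) / (standard error of that difference).
se <- sqrt(var(males) / length(males) + var(females) / length(females))
t_manual <- (mean(males) - mean(females)) / se

print(result$statistic)  # t statistic reported by t.test()
print(t_manual)          # same value, computed manually
print(result$p.value)    # probability of a difference this large if the population means were equal
```

A small p-value suggests the two population means are probably different. It says nothing about why they differ, and nothing about any individual student.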

I conclude my article with a caution:

Interpreting the results of a t-test is a very delicate process. It’s important to remember that the t-test is probabilistic and applies to groups, not individual items. My colleagues and I who work with statistics are often dismayed by completely incorrect causation conclusions drawn from statistics.
