I usually use Excel to create my graphs but sometimes I’ll use R, especially when I’m using R anyway for some sort of data analysis.
So this morning I wanted to make a simple scatter plot based on this data stored in a text file:
Age,Edu,Party
1.0,4.0,red
5.0,8.0,red
3.0,7.0,red
2.0,5.0,red
6.0,7.0,red
3.0,2.0,blue
7.0,5.0,blue
4.0,5.0,blue
2.0,3.0,blue
4.0,7.0,blue
First I loaded the data into a data frame:
mydf <- read.table("AgeEduParty.txt",
  header=TRUE, sep=",")
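A quick way to check the load is to look at the structure of the data frame. The following is a minimal sketch that assumes the same ten rows are supplied inline through the text argument of read.table, so it runs without the file on disk; the name csv_text is made up for illustration. Note that the Party column comes back as character in R 4.0.0 and later, and as a factor in earlier versions unless stringsAsFactors is set.

```r
# Inline copy of the data so the sketch does not depend on AgeEduParty.txt
csv_text <- "Age,Edu,Party
1.0,4.0,red
5.0,8.0,red
3.0,7.0,red
2.0,5.0,red
6.0,7.0,red
3.0,2.0,blue
7.0,5.0,blue
4.0,5.0,blue
2.0,3.0,blue
4.0,7.0,blue"

mydf <- read.table(text = csv_text, header = TRUE, sep = ",")

str(mydf)            # column names, types, and the first few values
class(mydf$Party)    # "character" or "factor", depending on the R version
```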
Then I made a scatter plot:
# factor levels sort alphabetically (blue, red), so the colors are listed in that order
plot(mydf$Age, mydf$Edu, xlim=c(0,9),
  ylim=c(0,9), xaxs="i", yaxs="i",
  col=c("blue","red")[factor(mydf$Party)],
  pch=20, cex=2)
I use xlim and ylim to explicitly set the range for the x and y axes because R doesn’t do a very good job with default values. The mysterious xaxs and yaxs parameters (axis style, where “i” means internal) force the axes to meet at (0,0) in the lower left corner. I hate the way R pads each axis with some extra space by default.
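The extra space can be measured directly. The sketch below, which is an illustration rather than part of the original session, draws the same plot twice and reads back the actual axis extents from par("usr"). It assumes a null PDF device so that it runs without opening a window, and the names usr_default and usr_internal are made up for this example.

```r
# Draw to a null PDF device so nothing is written to disk or screen
pdf(NULL)

# Default axis style "r": the requested range is padded by 4% at each end
plot(1:5, 1:5, xlim = c(0, 9), ylim = c(0, 9))
usr_default <- par("usr")

# Axis style "i": the plot region uses exactly the requested range
plot(1:5, 1:5, xlim = c(0, 9), ylim = c(0, 9), xaxs = "i", yaxs = "i")
usr_internal <- par("usr")

dev.off()

usr_default     # x1, x2, y1, y2 with padding
usr_internal    # x1, x2, y1, y2 without padding
```

With a range of 0 to 9, the 4% padding works out to 0.36 on each side, which is the gap that xaxs="i" and yaxs="i" remove.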
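Setting the two colors is sort of a magic R incantation: a vector of color names is indexed by the Party column, so each point picks up the color for its group. The pch (“plotting character”) value of 20 is a bullet. I used it because the default symbol is an open circle, and I wanted a solid fill. The cex (“character expansion”) value of 2 makes the bullet twice as large as the default size.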
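The color incantation is easier to follow in isolation. Here is a minimal sketch, assuming the Party values are held in a factor; the names party, palette_by_level, and point_cols are hypothetical. A factor stores integer codes with its levels sorted alphabetically, and indexing a vector by a factor uses those codes rather than the printed labels, so the color names have to be listed in level order.

```r
# Hypothetical stand-in for the Party column
party <- factor(c("red", "red", "blue", "blue"))

levels(party)       # levels sort alphabetically: "blue", then "red"
as.integer(party)   # the integer codes behind the labels

# Colors listed in the same order as the factor levels
palette_by_level <- c("blue", "red")

# Indexing by the factor picks one color per observation
point_cols <- palette_by_level[party]
point_cols
```

Listing the colors as c("red","blue") instead would quietly swap the two groups, because the code for "blue" is 1.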
Next, I fixed the axis intervals so that every integer from 0 to 9 gets a tick mark, instead of the gaps left by the default ticks:
axis(side=1, at=c(0:9))
axis(side=2, at=c(0:9))
And then I added grid lines:
abline(h=0:9,v=0:9, col="gray", lty=3)
I quit there but I could have added a legend, title, and all kinds of other options.
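For anyone who wants to go a step further, here is a minimal sketch of how a title and legend could be added with base graphics. It is an illustration, not the original session: the title text, the axis labels, and the legend position are made up, the data is supplied inline so the block runs on its own, and a null PDF device is assumed so no window is needed.

```r
# Null device so the sketch runs without a display; drop this line to see the plot
pdf(NULL)

mydf <- read.table(text = "Age,Edu,Party
1.0,4.0,red
5.0,8.0,red
3.0,7.0,red
2.0,5.0,red
6.0,7.0,red
3.0,2.0,blue
7.0,5.0,blue
4.0,5.0,blue
2.0,3.0,blue
4.0,7.0,blue", header = TRUE, sep = ",")

# The Party values are themselves valid color names, so they can be used directly
plot(mydf$Age, mydf$Edu, xlim = c(0, 9), ylim = c(0, 9),
     xaxs = "i", yaxs = "i", xlab = "Age", ylab = "Edu",
     col = as.character(mydf$Party), pch = 20, cex = 2)

axis(side = 1, at = 0:9)
axis(side = 2, at = 0:9)
abline(h = 0:9, v = 0:9, col = "gray", lty = 3)

# Hypothetical title text
title(main = "Edu vs. Age by Party")

# legend() invisibly returns a list describing the box and text positions
lg <- legend("topleft", legend = c("red", "blue"), col = c("red", "blue"),
             pch = 20, pt.cex = 2, bg = "white")

dev.off()
```

The bg="white" argument gives the legend an opaque background so the grid lines do not show through it.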