Support Vector Machines (SVM) using the R Language

A Support Vector Machine (SVM) is a machine learning algorithm that can do classification. SVMs were all the rage a few years ago but they’ve fallen out of favor a bit recently (deep neural networks are the current craze).

In most cases with ML algorithms, I like to code my own implementation so I have total control, and the custom implementation is typically at least one order of magnitude smaller in size than a library or tool.

However, SVMs are brutally difficult to code from scratch. The basic ideas are actually quite simple (relatively) but there are a huge number of details.

I thought I’d take a look at the svm() function in the R language. It’s really, really good.


First I created a text file with the classic Iris data set. Then I installed the weirdly-named “e1071” R package that contains the svm() function. I loaded my data into a data frame:

mydf = read.table("IrisData.txt", header=TRUE, sep=",", stringsAsFactors=TRUE)  # species is read as a factor
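If you don't have the text file handy, here is a rough sketch of one way to build it from R's built-in iris data. The column names and the temp-directory location are my assumptions for illustration (the only thing the rest of the post relies on is a column named species), so adjust them to match your own file.

```r
# Sketch only: the column names and file location below are assumptions.
irisdf = iris  # built-in copy of the Iris data (150 rows)
names(irisdf) = c("sepal_length", "sepal_width",
                  "petal_length", "petal_width", "species")

# Write a comma-delimited file with a header row
fn = file.path(tempdir(), "IrisData.txt")
write.table(irisdf, fn, sep=",", row.names=FALSE, quote=FALSE)

# Read it back; stringsAsFactors=TRUE makes species a factor
mydf = read.table(fn, header=TRUE, sep=",", stringsAsFactors=TRUE)
str(mydf)  # shows the column types
```

The stringsAsFactors=TRUE argument matters on R 4.0 and later, where text columns are otherwise read as plain character strings.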

Then I created the SVM model using the default radial kernel with its default parameter values:

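library(e1071)  # provides svm()
x = subset(mydf, select= -species)  # predictors: every column except species
y = factor(mydf$species)  # class labels; a factor tells svm() to do classification
mymodel = svm(x, y)  # use all defaults
mymodel  # print a summary of the fitted model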
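For reference, here is a sketch of the same kind of call with the defaults written out, as I understand them from the e1071 documentation: a radial kernel, cost = 1, and gamma = 1 / (number of predictor columns). It uses the built-in iris data and made-up variable names (x_demo, y_demo, model_explicit) so that it runs on its own and doesn't overwrite anything above.

```r
library(e1071)  # assumes the package is already installed

# Built-in iris stands in for the text file; the names here are illustrative
x_demo = subset(iris, select= -Species)
y_demo = iris$Species

# Defaults spelled out explicitly
model_explicit = svm(x_demo, y_demo, type="C-classification",
                     kernel="radial", cost=1, gamma=1/ncol(x_demo))

model_explicit$gamma  # 0.25 with four predictors
model_explicit$cost
```

Writing the parameters out like this gives you an obvious place to start if you want to experiment with other cost or gamma values.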

In the svm(x, y) call, x holds the predictor values (“all but species”) and y holds the categorical values to predict (Setosa, Versicolor, Virginica). Then I evaluated the predictive accuracy of the model on the same data it was trained on:

mypred = predict(mymodel, x)  # generate predicted species
table(mypred, y)  # confusion matrix: predicted (rows) vs. actual (columns)
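The table can be boiled down to a single accuracy number by dividing the count on the diagonal (correct predictions) by the total count. Here is a minimal sketch that again uses the built-in iris data and illustrative variable names so that it runs on its own. Keep in mind this is accuracy on the training data, which will usually look better than accuracy on data the model hasn't seen.

```r
library(e1071)  # assumes the package is already installed

# Illustrative names; built-in iris stands in for the text file
x_demo = subset(iris, select= -Species)
y_demo = iris$Species
model_demo = svm(x_demo, y_demo)  # all defaults
pred_demo = predict(model_demo, x_demo)

cm = table(pred_demo, y_demo)  # rows: predicted, columns: actual
acc = sum(diag(cm)) / sum(cm)  # fraction of correct predictions
acc
```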

Very slick. The R language is often like this: some very difficult tasks are easy to perform (but some easy tasks are very tricky to perform).
