Support Vector Machine (SVM) classification is a machine learning technique that can be used to make a binary prediction — that is, one where the thing-to-predict can be just one of two possible values. For example, you might want to predict if a person is a Male (-1) or Female (+1) based on predictor variables such as age and annual income.
SVM classification was popular in the late 1990s, but it isn't used very much anymore; neural networks have largely taken its place.
SVMs require you to specify a kernel function (such as RBF), any parameters the kernel function needs (gamma, in the case of RBF), and a value for C, which controls how much noise the SVM will tolerate during training. SVMs tend to be very sensitive to the values of these hyperparameters, so training an SVM can be a pain.
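For reference, the RBF kernel and its gamma parameter amount to just a few lines of Python. This is a minimal sketch; the function name is mine, not from any particular library:

```python
import math

def kernel_rbf(x1, x2, gamma):
    # RBF (Gaussian) kernel: K(x1, x2) = exp(-gamma * ||x1 - x2||^2).
    # Larger gamma means the kernel value falls off faster with distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * sq_dist)
```

Notice that the kernel value is 1.0 when the two vectors are identical and decays toward 0.0 as they move apart, which is why gamma has such a strong effect on how "local" the resulting model is.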
Researchers were getting all excited about SVMs in the 1990s because the math is very elegant. But realistically, much of that research work was a solution in search of a problem.
Just for kicks, I decided to refresh my memory with regard to SVMs, so I coded up an implementation using raw Python. My two main resources were the excellent Web site by Alexandre Kowalczyk at https://www.svm-tutorial.com/2014/11/svm-understanding-math-part-1/ and a research paper by John Platt that describes how to train an SVM at http://luthuli.cs.uiuc.edu/~daf/courses/optimization/Papers/smoTR.pdf.
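The heart of Platt's SMO algorithm is that it repeatedly picks a pair of Lagrange multipliers and optimizes them analytically while all the other multipliers are held fixed. The core pair update can be sketched like this (a sketch only; the function name and argument layout are mine, and a real implementation also needs Platt's heuristics for choosing which pair to update):

```python
def smo_pair_update(alpha_i, alpha_j, y_i, y_j, E_i, E_j,
                    K_ii, K_jj, K_ij, C):
    # One SMO step: jointly optimize two Lagrange multipliers while
    # preserving the constraint sum(alpha_k * y_k) = 0.
    if y_i != y_j:
        L = max(0.0, alpha_j - alpha_i)
        H = min(C, C + alpha_j - alpha_i)
    else:
        L = max(0.0, alpha_i + alpha_j - C)
        H = min(C, alpha_i + alpha_j)
    eta = 2.0 * K_ij - K_ii - K_jj  # negative second derivative of the objective
    if L == H or eta >= 0.0:
        return alpha_i, alpha_j     # no progress possible on this pair
    a_j = alpha_j - y_j * (E_i - E_j) / eta
    a_j = min(H, max(L, a_j))       # clip to the feasible box [L, H]
    a_i = alpha_i + y_i * y_j * (alpha_j - a_j)  # keep the sum constraint
    return a_i, a_j
```

Here E_i and E_j are the prediction errors (decision value minus target) for the two points, K_ii, K_jj, and K_ij are kernel evaluations, and C is the usual noise-tolerance hyperparameter.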
Coding up an SVM implementation is very challenging and it took me over a day, but I did pick up some valuable coding tricks and learned several new things about machine learning.
In the demo below, the data has a circular geometry, so I used an RBF kernel with gamma = 1.0 and set the C value to 10.0. The demo model correctly predicts all 21 of the data points.
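To give a feel for the whole pipeline, here is a simplified SMO trainer and predictor run on toy circular data with the same hyperparameters as the demo (gamma = 1.0, C = 10.0). This is a sketch in the spirit of Platt's paper, not my demo code: the second multiplier is chosen at random rather than by Platt's full heuristics, and all the names and the tiny dataset are mine:

```python
import math
import random

def rbf(x1, x2, gamma):
    # RBF kernel: exp(-gamma * squared Euclidean distance)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x1, x2)))

def train_svm(X, y, C, gamma, tol=1e-3, max_passes=10, seed=0):
    # Simplified SMO: sweep the data, and for each KKT-violating point
    # optimize its alpha jointly with a randomly chosen partner.
    rng = random.Random(seed)
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alphas = [0.0] * n
    b = 0.0

    def f(i):  # decision value for training point i
        return sum(alphas[k] * y[k] * K[k][i] for k in range(n)) + b

    passes = 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            E_i = f(i) - y[i]
            if (y[i] * E_i < -tol and alphas[i] < C) or \
               (y[i] * E_i > tol and alphas[i] > 0):
                j = rng.randrange(n - 1)
                if j >= i:
                    j += 1  # ensure j != i
                E_j = f(j) - y[j]
                ai_old, aj_old = alphas[i], alphas[j]
                if y[i] != y[j]:
                    L, H = max(0.0, aj_old - ai_old), min(C, C + aj_old - ai_old)
                else:
                    L, H = max(0.0, ai_old + aj_old - C), min(C, ai_old + aj_old)
                eta = 2.0 * K[i][j] - K[i][i] - K[j][j]
                if L == H or eta >= 0.0:
                    continue
                aj_new = min(H, max(L, aj_old - y[j] * (E_i - E_j) / eta))
                if abs(aj_new - aj_old) < 1e-5:
                    continue
                ai_new = ai_old + y[i] * y[j] * (aj_old - aj_new)
                alphas[i], alphas[j] = ai_new, aj_new
                # Recompute the bias from whichever alpha stayed inside (0, C).
                b1 = b - E_i - y[i] * (ai_new - ai_old) * K[i][i] \
                       - y[j] * (aj_new - aj_old) * K[i][j]
                b2 = b - E_j - y[i] * (ai_new - ai_old) * K[i][j] \
                       - y[j] * (aj_new - aj_old) * K[j][j]
                if 0.0 < ai_new < C:
                    b = b1
                elif 0.0 < aj_new < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    return alphas, b

def predict(x, X, y, alphas, b, gamma):
    # Decision function: f(x) = sum_k alpha_k * y_k * K(x_k, x) + b
    val = sum(alphas[k] * y[k] * rbf(X[k], x, gamma)
              for k in range(len(X))) + b
    return 1 if val >= 0.0 else -1

# Toy circular data: class +1 clustered near the origin,
# class -1 on an outer ring -- not linearly separable.
X = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2),
     (2.0, 0.0), (0.0, 2.0), (-2.0, 0.0), (0.0, -2.0)]
y = [1, 1, 1, 1, -1, -1, -1, -1]
alphas, b = train_svm(X, y, C=10.0, gamma=1.0)
```

A plain linear kernel has no chance on data like this, which is exactly why the circular geometry calls for RBF.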
Moral of the story: You really don’t want to code an SVM classifier from scratch unless you absolutely have to, or unless you really want to understand SVMs. And for binary classification in general, you’re probably better off using a neural network.