## Multi-Class Logistic Regression Classification

I wrote an article titled “Multi-Class Logistic Regression Classification” for the April 2015 issue of Microsoft MSDN Magazine. See https://msdn.microsoft.com/en-us/magazine/dn948113.aspx.

The goal of a multi-class logistic regression problem is to predict something that can have three or more possible discrete values. For example, you might want to predict the political inclination (conservative, moderate, liberal) of a person based on predictor variables such as their age, sex, annual income, and so on.

Regular logistic regression (LR) predicts something that can have one of just two possible values. For example, you might want to predict the sex (male, female) of a person. Regular logistic regression is one of the most basic forms of machine learning. In regular LR, one of the two possible values is arbitrarily assigned a 0 and the other is assigned a 1. For example, male = 0 and female = 1. The input values produce a single numeric output value between 0.0 and 1.0. If the output value is less than 0.5 (i.e., closer to 0) you predict the 0 result (male). If the output value is greater than 0.5 (i.e., closer to 1) you predict the 1 result (female).
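
To make the regular LR decision rule concrete, here is a minimal Python sketch. It assumes the usual logistic sigmoid function is what squashes the weighted sum of the inputs into the 0.0 to 1.0 range. The function names, the weights, and the bias value are all hypothetical, made up for illustration, and would normally come from training.

```python
import math

def logistic_output(x, weights, bias):
    # Weighted sum of the inputs plus a bias term.
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    # The logistic sigmoid maps z to a value between 0.0 and 1.0.
    return 1.0 / (1.0 + math.exp(-z))

def predict_binary(x, weights, bias):
    # Greater than 0.5 means class 1 (female); otherwise class 0 (male).
    # An output of exactly 0.5 is treated as class 0 here, an arbitrary choice.
    return 1 if logistic_output(x, weights, bias) > 0.5 else 0

# Hypothetical, untrained values purely for demonstration.
example_x = [1.0, 2.0]
example_weights = [0.5, -0.25]
example_bias = 0.1

# Here z = 0.1, so the output is slightly above 0.5 and the prediction is 1.
print(predict_binary(example_x, example_weights, example_bias))
```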

Multi-class logistic regression extends this idea. If there are three possible classes in the thing to predict, for example, (conservative, moderate, liberal), then the input values produce three numeric values that sum to 1.0. For example, the output values might be (0.20, 0.70, 0.10). The predicted class is the one that corresponds to the largest output value (moderate).
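
A short Python sketch shows the idea. It assumes the softmax function is what turns one raw score per class into output values that sum to 1.0, which is the standard approach for multi-class LR. The function names and the raw scores below are hypothetical and only for illustration.

```python
import math

CLASSES = ["conservative", "moderate", "liberal"]

def softmax(scores):
    # Subtract the largest score first so math.exp cannot overflow.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    # Each result is between 0.0 and 1.0, and together they sum to 1.0.
    return [e / total for e in exps]

def predict_class(outputs, labels):
    # Pick the label whose output value is the largest.
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return labels[best]

# The output values from the example in the text.
print(predict_class([0.20, 0.70, 0.10], CLASSES))  # prints moderate

# Hypothetical raw scores, one per class, converted to outputs by softmax.
outputs = softmax([0.5, 1.8, -0.2])
print(predict_class(outputs, CLASSES))  # prints moderate
```

Because softmax preserves the ordering of the raw scores, the class with the largest score always ends up with the largest output value.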

Multi-class logistic regression isn’t used very much. I suspect this is because writing the code for multi-class LR is quite a bit trickier than for regular LR.