I consider logistic regression to be the “Hello World” of machine learning. I gave a talk to a group of (mostly) engineers, where I explained what logistic regression (LR) is, explained how LR works, and described different ways to perform an LR analysis.
1. Logistic regression is a technique for binary prediction. For example: predict if a person is a political conservative (0) or liberal (1) based on their age, income, and years of education.
2. Logistic regression is basically a prediction equation with special constants called weights and a bias.
3. The prediction equation is p = 1.0 / (1.0 + e^-z) where z = b0 + (b1)(x1) + (b2)(x2) + . . . + (bn)(xn)
The b0 is the bias, the b1 through bn are the weights, and the x1 through xn are the input predictor values. The p value will always be between 0 and 1. If p is less than 0.5 the prediction is class “0”, otherwise the prediction is class “1”.
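The prediction equation above can be sketched in a few lines of Python. The bias, weights, and input values below are made-up numbers for the age/income/education example, not values from a trained model:

```python
import math

def predict(bias, weights, inputs):
    # z = b0 + (b1)(x1) + (b2)(x2) + ... + (bn)(xn)
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    # logistic sigmoid: p = 1 / (1 + e^-z), always between 0 and 1
    return 1.0 / (1.0 + math.exp(-z))

# hypothetical values -- bias b0, then weights for age, income, education
bias = -2.0
weights = [0.01, 0.02, 0.15]
inputs = [35.0, 5.5, 16.0]   # age, income (in some scaled units), years of education

p = predict(bias, weights, inputs)
label = 0 if p < 0.5 else 1  # p is about 0.70, so the prediction is class 1
```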
4. Determining the values of the weights and biases for an LR model is called training. There are many training algorithms including gradient ascent to maximize log-likelihood (the most common), gradient descent to minimize squared error (most similar to neural network training), iterated Newton-Raphson, L-BFGS, and swarm optimization.
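A minimal sketch of the most common approach, gradient ascent on the log-likelihood, looks like the code below. The learning rate, epoch count, and toy one-predictor dataset are assumptions for illustration; real training would use normalized data and a stopping condition:

```python
import math

def train(data, labels, lr=0.1, epochs=1000):
    # stochastic gradient ascent to maximize log-likelihood
    n = len(data[0])
    bias, weights = 0.0, [0.0] * n
    for _ in range(epochs):
        for xs, y in zip(data, labels):
            z = bias + sum(w * x for w, x in zip(weights, xs))
            p = 1.0 / (1.0 + math.exp(-z))
            # gradient of the log-likelihood: (y - p) for the bias, (y - p) * x for each weight
            bias += lr * (y - p)
            weights = [w + lr * (y - p) * x for w, x in zip(weights, xs)]
    return bias, weights

# toy one-predictor dataset: class 0 for small x, class 1 for large x
data = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]
bias, weights = train(data, labels)
```

Gradient descent to minimize squared error has the same loop shape; only the update term changes.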
5. You can do an LR analysis by writing raw code (my preferred technique because you get total control), by using the R language glm() function (very easy), by using the Azure Machine Learning service, by using the internal-Microsoft TLC library (the engine underneath Azure ML), or by using the CNTK code library (most useful for enormous datasets).
During my talk, I was reminded how difficult it can be to explain something simple. There’s a delicate balance between being too brief and presenting too much background information and extra detail.
But there’s no better way for me to be sure I completely understand a machine learning topic than to give a talk on the topic.