A few days ago I did some thought experiments about different schemes to automatically stop training a logistic regression model. I was motivated by the poor performance of the scikit-learn library's LogisticRegression model with default parameters.
After quite a bit of thought, I decided to stop based on average squared error per training item. It’s best explained by example. Suppose your training data has just three items and the variable to predict, sex, is encoded as 0 = male, 1 = female, and each item has just two predictor values, age and income:
age   income    sex
-------------------
0.39  0.54000   0
0.28  0.32000   1
0.40  0.64000   0
Now suppose that at some point during training the model’s computed outputs are [0.20, 0.60, 0.30]. The squared error terms for each of the three data items are:
(0 - 0.20)^2 = 0.04
(1 - 0.60)^2 = 0.16
(0 - 0.30)^2 = 0.09
The average squared error for the three items is (0.04 + 0.16 + 0.09) / 3 ≈ 0.097. I call this metric average squared error to distinguish it from the mean squared error used as the loss function during training; the two compute the same value.
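The calculation can be verified in a few lines of Python (this is just the worked example above, not code from any library):

```python
# Verify the worked example: targets and computed model outputs.
targets = [0, 1, 0]
outputs = [0.20, 0.60, 0.30]

# average squared error = mean of (target - output)^2 over all items
ase = sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)
print(round(ase, 3))  # 0.097
```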
To make a long story short, after a bit of experimentation, setting an auto-stop condition of average squared error less than 0.20 seemed to work pretty well.
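A minimal sketch of what such an auto-stop condition could look like inside a from-scratch logistic regression training loop. The learning rate, epoch limit, and stochastic gradient descent details here are illustrative assumptions, not the actual setup from my experiments:

```python
import math

def avg_sq_error(wts, bias, data, targets):
    # average of (output - target)^2 over all training items
    total = 0.0
    for x, t in zip(data, targets):
        z = sum(w * xi for w, xi in zip(wts, x)) + bias
        p = 1.0 / (1.0 + math.exp(-z))  # logistic sigmoid output
        total += (p - t) ** 2
    return total / len(data)

def train(data, targets, lr=0.1, max_epochs=5000, stop_ase=0.20):
    # weights and bias initialized to zero for simplicity
    wts = [0.0] * len(data[0])
    bias = 0.0
    for epoch in range(max_epochs):
        # one pass of stochastic gradient descent on log loss
        for x, t in zip(data, targets):
            z = sum(w * xi for w, xi in zip(wts, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(x):
                wts[j] -= lr * (p - t) * xj
            bias -= lr * (p - t)
        ase = avg_sq_error(wts, bias, data, targets)
        if ase < stop_ase:  # auto-stop condition
            break
    return wts, bias, epoch, ase

# the three-item demo data from the example above
data = [[0.39, 0.54], [0.28, 0.32], [0.40, 0.64]]
targets = [0, 1, 0]
wts, bias, stop_epoch, ase = train(data, targets)
print("stopped at epoch", stop_epoch, "with ASE", round(ase, 4))
```

Note that the loss being minimized is log loss, while the stopping check is on average squared error, so the stop metric is monitored separately from the training gradient.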
Below: Photoshop artist James Fridman accepts requests from his fans to “fix” their photos. Fridman, hilariously, never knows when to stop.