## Neural Network L1 Regularization using Python

I wrote an article titled “Neural Network L1 Regularization using Python” in the December 2017 issue of Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2017/12/05/neural-network-regularization.aspx.

The most difficult part about L1 regularization is understanding what problem it solves. A full explanation of the problem scenario would take several pages, so here is a very, very brief version.

You can think of a neural network as a very complex math equation that makes a prediction. A neural network has numeric constants called weights (and special weights called biases) that determine the output of the network. Determining the values of the NN weights is called training the network. To train the network, you use a set of training data that has known input values and known correct output values. You use an optimization algorithm (usually back-propagation) to find values for the weights so that when presented with the inputs in the training data, the computed output values closely match the known, correct output values. (Whew. Believe me, I’ve left out a ton of detail.)
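To make the "complex math equation" idea concrete, here is a minimal sketch of a tiny network computing one output and comparing it to a known correct value. The 2-2-1 architecture, the tanh activation, the squared-error measure, and all the weight, bias, and data values are assumptions chosen for illustration; none of them come from the article.

```python
import math

def forward(x, w_ih, b_h, w_ho, b_o):
    """Compute the single output of a tiny network with one hidden layer.

    x    : list of input values
    w_ih : input-to-hidden weights, w_ih[i][j] connects input i to hidden node j
    b_h  : hidden node biases
    w_ho : hidden-to-output weights
    b_o  : output node bias
    """
    hidden = []
    for j in range(len(b_h)):
        total = b_h[j] + sum(x[i] * w_ih[i][j] for i in range(len(x)))
        hidden.append(math.tanh(total))  # tanh activation (an assumption)
    return b_o + sum(hidden[j] * w_ho[j] for j in range(len(hidden)))

# Hypothetical weights and biases; training would search for better values.
w_ih = [[0.1, -0.2], [0.3, 0.4]]
b_h = [0.0, 0.1]
w_ho = [0.5, -0.6]
b_o = 0.05

# One hypothetical training item: known inputs and a known correct output.
x = [1.0, 2.0]
target = 0.3

output = forward(x, w_ih, b_h, w_ho, b_o)
squared_error = (output - target) ** 2
print(output, squared_error)
```

Training amounts to adjusting the values in `w_ih`, `b_h`, `w_ho`, and `b_o` so that `squared_error`, summed over all training items, becomes small.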

A major problem with NN training is called overfitting. This means that if you train the network too well, the network will be very accurate on the training data, but when presented with new, previously unseen data, the trained network predicts poorly.

There are several techniques that can be used to deal with overfitting. One of these techniques is called regularization. There are two main forms of regularization, called L1 and L2. Again, a full explanation of L1 regularization would take many pages, so here is the short version.

Overfitting is often characterized by weight values that are very large. L1 regularization limits the magnitudes of the network’s weights by adding a penalty, based on the sum of the absolute values of the weights, to the error that training tries to minimize.
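The weight-limiting idea can be sketched in a few lines. This is one common way to apply L1 regularization during gradient-based training and is not necessarily the code from the article: the penalty is `lam` times the sum of the absolute weight values, so its gradient with respect to each weight is `lam` times the sign of that weight. The names `l1_penalty`, `l1_update`, `lam`, and `learn_rate`, along with all the numeric values, are hypothetical.

```python
def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def l1_penalty(weights, lam):
    """L1 penalty term: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)

def l1_update(weights, grads, learn_rate, lam):
    """One gradient-descent step with the L1 penalty gradient added.

    grads holds the gradient of the plain (unpenalized) error for each weight.
    The extra lam * sign(w) term nudges every nonzero weight toward zero.
    """
    return [w - learn_rate * (g + lam * sign(w))
            for w, g in zip(weights, grads)]

# Hypothetical weights; error gradients set to zero to isolate the penalty.
weights = [4.0, -3.0, 0.5]
grads = [0.0, 0.0, 0.0]

penalty = l1_penalty(weights, lam=1.0)
new_weights = l1_update(weights, grads, learn_rate=0.1, lam=1.0)
print(penalty, new_weights)
```

With the error gradients zeroed out, each weight moves toward zero by the same fixed amount (`learn_rate * lam`) regardless of its size, which is why L1 regularization tends to drive small weights all the way to zero.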

My article shows exactly how you’d go about doing this. In most cases you wouldn’t want to code an NN from scratch. But for me at least, the only way to fully understand what is going on inside machine learning libraries like CNTK, TensorFlow, and Keras is to see how the code works.