I wrote an article titled “Neural Network Momentum using Python” in the August 2017 issue of Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2017/08/01/neural-network-momentum.aspx

Momentum is a technique intended to speed up neural network training. Training a neural network is the process of determining the values of the weights and biases that essentially define the behavior of the network. The most common training algorithm is called back-propagation. Back-propagation is an iterative process which can take a very long time for complex neural networks.

The basic update for one weight is w = w + (-1 * lr * grad(w)). Put a bit differently:

delta = -1 * lr * grad(w)
w = w + delta

In words, the new weight value is the old value plus -1 times a small learning rate constant times the current gradient value of the weight. The learning rate is a small constant, perhaps 0.01, but is determined by trial and error. The gradient is the calculus derivative (just a number like -2.34) where the sign tells you if the weight needs to increase or decrease and the magnitude influences how much the weight changes in one update.
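Here is a minimal Python sketch of the basic update for a single weight, using the example numbers mentioned above. The function name update_weight and the starting weight of 0.50 are assumptions made for illustration; this is not code from the article.

```python
def update_weight(w, lr, grad_w):
    # Basic update: new weight = old weight + (-1 * lr * gradient)
    delta = -1 * lr * grad_w
    return w + delta

# Hypothetical values: the starting weight 0.50 is made up; lr and the
# gradient match the example numbers in the text.
w = 0.50
lr = 0.01
grad_w = -2.34

w_new = update_weight(w, lr, grad_w)
print(w_new)  # slightly larger than 0.50, because the gradient is negative
```

Because the gradient is negative, the delta is positive, and the weight increases by the learning rate times the gradient's magnitude (0.0234 here).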

Adding momentum is straightforward:

delta = -1 * lr * grad(w)
w = w + delta + (mf * prev(delta))

In each weight update, you add an extra term: a momentum factor constant (typically something like 0.50) times the value of the delta from the previous update iteration.
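The momentum update can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the demo program from the article: the one-weight error function (w - 3)^2, the function names grad and train, and the starting weight of 0.0 are all made up. It follows the formula as written here, so prev(delta) is taken to be the previous iteration's -1 * lr * grad(w) value; some other formulations of momentum carry over the full previous update instead.

```python
def grad(w):
    # Gradient of a made-up error function (w - 3)^2, minimized at w = 3
    return 2.0 * (w - 3.0)

def train(steps, lr=0.01, mf=0.0):
    # mf is the momentum factor; mf = 0.0 gives the basic update
    w = 0.0
    prev_delta = 0.0  # there is no previous delta before the first iteration
    for _ in range(steps):
        delta = -1 * lr * grad(w)
        w = w + delta + (mf * prev_delta)
        prev_delta = delta  # saved for the next iteration
    return w

w_plain = train(100, mf=0.0)
w_momentum = train(100, mf=0.50)

# Distance from the minimum at w = 3 after the same number of updates
print(abs(3.0 - w_plain), abs(3.0 - w_momentum))
```

On this toy problem, the run with momentum ends closer to the minimum after the same number of iterations, which is the speed-up that momentum is intended to give. Real networks have many weights and biases, and the benefit depends on the learning rate and the momentum factor.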

In my article, I go through the details of neural network momentum and give a complete demo program written from scratch in Python.