I wrote an article titled “Gradient Descent Training Using C#” in the March 2015 issue of Microsoft's MSDN Magazine. See https://msdn.microsoft.com/en-us/magazine/dn913188.aspx.
Gradient descent is an idea that’s simple once you understand it, but hard to grasp at first. In many machine learning (ML) problems, you need to find values for a set of weight variables so that some measure of error is minimized. The error varies as each weight's value changes. In calculus, a quantity called the partial derivative indicates how much, and in what direction (+ or -), a weight value should change in order to lessen error.
The set of all partial derivatives (one for each weight) is called the gradient. Notice that the term “gradient” is singular even though it has multiple components. For simplicity, each individual partial derivative is often called a gradient, even though that’s not technically correct.
Gradient descent adjusts each weight in the direction indicated by its partial derivative so that error is lessened. Graphically, you move down (descend) an error curve, guided by the partial derivatives (the gradient).
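The core update is easy to see with a single weight. Here's a minimal sketch (in Python for brevity; the article's demo code is C#) that minimizes a simple error function, error(w) = (w - 3)^2, whose derivative is 2 * (w - 3). The function names and hyperparameter values are illustrative, not from the article.

```python
# Minimal gradient descent sketch: minimize error(w) = (w - 3)^2.
# The derivative 2 * (w - 3) tells us which way, and how far, to move w.
def gradient_descent(grad, w, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad(w)  # step opposite the gradient to lessen error
    return w

w = gradient_descent(lambda w: 2.0 * (w - 3.0), w=0.0)
# w ends up very close to 3.0, the value that minimizes the error
```

The learning rate lr controls the step size: too small and training is slow, too large and the updates can overshoot the minimum and diverge.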
In the MSDN Magazine article, I show how to use gradient descent to find the weight values for a logistic regression problem that predicts a binary result (true or false) for some synthetic data.
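For logistic regression, the per-weight partial derivative of the log-loss error has a particularly clean form, so the gradient descent update is just a few lines. Below is a hedged Python sketch of the idea on tiny synthetic data; the variable names, hyperparameters, and data generation here are my own illustrations, not taken from the article's C# demo.

```python
import math
import random

# Sketch: logistic regression trained with gradient descent.
# For log loss, the partial derivative for weight j on one sample
# works out to (p - y) * x[j], so the descent update is
# w[j] += lr * (y - p) * x[j].
def train(data, lr=0.05, epochs=200):
    n = len(data[0][0])
    w = [0.0] * n  # one weight per feature
    b = 0.0        # bias weight
    for _ in range(epochs):
        for x, y in data:
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            for j in range(n):
                w[j] += lr * (y - p) * x[j]  # move each weight downhill
            b += lr * (y - p)
    return w, b

# Tiny synthetic data set: label is 1 exactly when the feature is positive.
random.seed(0)
data = [([x], 1 if x > 0 else 0)
        for x in (random.uniform(-2, 2) for _ in range(100))]
w, b = train(data)
```

After training, a sample with a clearly positive feature value gets a predicted probability above 0.5 and a clearly negative one gets a probability below 0.5, matching the synthetic labeling rule.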