Training a Deep Neural Network using Back-Propagation

Over the past few weeks I’ve been coding a deep neural network (DNN) from scratch, using the C# language. My most recent milestone was getting a back-propagation training method up and running.

I’ve coded back-prop for single-hidden-layer (i.e., non-deep) neural networks many times, so I wasn’t expecting too much trouble when coding back-prop for a DNN. But it was a lot trickier than I thought it’d be. There wasn’t any one thing that stumped me, but there were just a ton of details.

Anyway, when implementing back-prop, there are a few design choices. Conceptually, the choice of using mean squared error or cross-entropy error is a big decision, but in terms of implementation, the error function isn’t a big deal.
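
To make that concrete, here's a rough sketch (not my actual code) of the output-layer gradient signals for the two error functions, assuming softmax output activation with cross-entropy error and tanh output activation with mean squared error. Everything downstream in back-prop is the same either way:

using System;

// Hypothetical example values for one training item; the names are
// illustrative, not from my actual code.
double[] outputs = { 0.20, 0.70, 0.10 };       // computed outputs (e.g., softmax)
double[] targets = { 0.00, 1.00, 0.00 };       // desired target values
double[] oGrads = new double[outputs.Length];  // output-layer gradient signals

for (int k = 0; k < outputs.Length; ++k)
{
  // cross-entropy error + softmax activation: the signal collapses to (output - target)
  oGrads[k] = outputs[k] - targets[k];

  // mean squared error + tanh activation would instead be:
  // oGrads[k] = (outputs[k] - targets[k]) * (1.0 - outputs[k] * outputs[k]);
}

Console.WriteLine(string.Join(" ", oGrads));  // 0.2 -0.3 0.1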

[Image: deep neural net training]

I coded my back-prop routine using “stochastic” training, as opposed to batch or mini-batch training. Conceptually that’s not a big deal, but implementation-wise, batch and mini-batch are a lot more work.
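
Roughly, the stochastic loop looks like the sketch below. The single-weight "model" is just a stand-in so the thing compiles and runs; the point is the structure: shuffle the visiting order every epoch, then update the weights immediately after each individual training item:

using System;

// Toy stand-in for the full DNN: one weight fitting y = 2x, so the loop
// structure compiles and runs. Stochastic ("online") training updates the
// weights right after every single training item, never accumulating a batch.
double[][] trainData = {
  new[] { 1.0, 2.0 }, new[] { 2.0, 4.1 },
  new[] { 3.0, 5.9 }, new[] { 4.0, 8.2 }  // each row: { input x, target t }
};
double w = 0.0;
double learnRate = 0.01;
int maxEpochs = 200;
Random rnd = new Random(0);

int[] sequence = new int[trainData.Length];
for (int i = 0; i < sequence.Length; ++i) sequence[i] = i;

for (int epoch = 0; epoch < maxEpochs; ++epoch)
{
  // Fisher-Yates shuffle so items are visited in a different order each epoch
  for (int i = sequence.Length - 1; i > 0; --i)
  {
    int j = rnd.Next(i + 1);
    int tmp = sequence[i]; sequence[i] = sequence[j]; sequence[j] = tmp;
  }

  foreach (int idx in sequence)
  {
    double x = trainData[idx][0];
    double t = trainData[idx][1];
    double y = w * x;              // "forward pass"
    w -= learnRate * (y - t) * x;  // update right away, after this one item
  }
}

Console.WriteLine("learned w = " + w);  // roughly 2.0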

Another implementation option is whether or not to use momentum. I used momentum, because you can always set the momentum term to 0.0 if you don’t want it.
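
Concretely, each weight just remembers the delta from its previous update, and the new delta blends the fresh gradient step with that previous delta. Here's a minimal sketch, with made-up names and values (not my actual code):

using System;

// Hypothetical single-weight sketch of a momentum update.
double weight = 0.10;
double prevDelta = 0.0;    // delta applied to this weight on the previous update
double learnRate = 0.05;
double momentum = 0.10;    // set this to 0.0 and you get plain back-prop

// Pretend this came out of the back-prop pass for this weight:
double grad = 0.30;        // partial derivative of the error w.r.t. this weight

// Classical momentum: the new delta blends the gradient step with the previous
// delta, so repeated moves in the same direction pick up speed.
double delta = -learnRate * grad + momentum * prevDelta;
weight += delta;
prevDelta = delta;         // remember for the next update

Console.WriteLine("updated weight = " + weight);  // 0.10 - 0.015 = 0.085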

Anyway, it was very good fun. Next up, I think I’ll implement batch and mini-batch training. Also, I’ll need to carefully go over the code because anything this complex almost certainly has a few (hopefully minor) bugs.

This entry was posted in Machine Learning. Bookmark the permalink.

3 Responses to Training a Deep Neural Network using Back-Propagation

  1. mosdeo says:

    Congratulations!
    But does your DNN perform better than your 1-hidden-layer network?

  2. mosdeo says:

    I extended your 1-hidden-layer BP-NN to 2 hidden layers, but learning needs more epochs and the test error is bigger.
    So I'm wondering whether to use ReLU in place of tanh?
