Deep Neural Network Training Batch vs. Online

I’ve been getting my butt kicked, technically speaking, for the past couple of days. I’ve been exploring training deep neural networks. When I use standard online training with back-propagation, my code seems to work fairly well:

My code creates 2,000 dummy items. Each item has four inputs and three outputs and looks like (4.5, -3.2, 1.6, -2.0, 0, 0, 1). The generator uses a 4-(10,10,10)-3 deep NN: four inputs, three hidden layers of ten nodes each, and three outputs. Therefore, the generator has (4 * 10) + (10 * 10) + (10 * 10) + (10 * 3) + 30 + 3 = 303 weights and biases that must be determined (270 weights, plus 30 hidden-node biases and 3 output-node biases).
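The weight-and-bias arithmetic is easy to double-check. Here is a minimal Python sketch, not the demo program itself, that counts the values for any fully connected feed-forward network. The function name count_weights_and_biases is made up for illustration.

```python
def count_weights_and_biases(num_input, hidden_sizes, num_output):
    """Count weights and biases in a fully connected feed-forward network."""
    layer_sizes = [num_input] + list(hidden_sizes) + [num_output]
    # One weight per connection between adjacent layers.
    num_weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # One bias per hidden node and per output node (input nodes have none).
    num_biases = sum(layer_sizes[1:])
    return num_weights, num_biases


weights, biases = count_weights_and_biases(4, (10, 10, 10), 3)
print(weights, biases, weights + biases)  # 270 33 303
```

For the 4-(10,10,10)-3 network this gives 270 weights and 33 biases, which matches the 303 total.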

One of the points of my investigation is to explore the vanishing gradient phenomenon. In the image above I display one gradient every 200 training epochs and you can see that, as expected, it quickly goes to nearly 0 (to four decimals).
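One ingredient of the vanishing gradient effect can be seen in a toy calculation. The Python sketch below is not the demo program. It assumes logistic sigmoid activation (the demo's activation function isn't stated here) and uses a made-up chain of one-node layers, and the name chain_gradient is hypothetical. Because the sigmoid derivative never exceeds 0.25, each extra layer multiplies the back-propagated gradient by a small factor.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def chain_gradient(x, weights):
    """Derivative of the final output with respect to the input x for a chain
    of one-node layers, each computing sigmoid(w * previous_output)."""
    grad = 1.0
    activation = x
    for w in weights:
        activation = sigmoid(w * activation)
        # Chain rule: d sigmoid(w * a) / d a = w * s * (1 - s),
        # where s is the layer output and s * (1 - s) <= 0.25.
        grad *= w * activation * (1.0 - activation)
    return grad


# Assumed weights of 1.0 everywhere, purely for illustration.
for depth in (1, 3, 6):
    print(depth, chain_gradient(0.5, [1.0] * depth))
```

With all weights set to 1.0, the gradient through three layers can be no larger than 0.25 cubed, and it keeps shrinking as layers are added.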

So, just for fun I thought I’d see what the effect of using batch training would be:

What the heck?! The NN just doesn’t learn at all. Now I know that online training often works better than batch training, but this result is extreme. I suspect I have a bug in my batch-training code. But tracking down a problem in code like this could easily take days, so I’m going to have to put it aside for now. Grrr.
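For reference, the difference between the two schemes comes down to when the weights get updated. The Python sketch below uses a deliberately tiny one-weight linear model rather than a deep NN, and every name in it (online_epoch, batch_epoch, lr, the data values) is hypothetical. It is not a diagnosis of the problem above, just an illustration of the two update schedules.

```python
def online_epoch(w, data, lr):
    """Online (stochastic) training: update the weight after every item."""
    for x, y in data:
        grad = (w * x - y) * x  # gradient of 0.5 * (w*x - y)^2 with respect to w
        w -= lr * grad
    return w


def batch_epoch(w, data, lr):
    """Batch training: accumulate gradients over all items, then update once."""
    grad_sum = 0.0  # the accumulator must start at zero every epoch
    for x, y in data:
        grad_sum += (w * x - y) * x
    # Average the accumulated gradient so lr is comparable to the online case.
    return w - lr * grad_sum / len(data)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # made-up items with y = 2 * x
w_online = w_batch = 0.0
for _ in range(200):
    w_online = online_epoch(w_online, data, lr=0.05)
    w_batch = batch_epoch(w_batch, data, lr=0.05)
print(w_online, w_batch)  # both should end up close to 2.0
```

Two things generally worth checking in a batch version are that the gradient accumulators are reset at the start of each epoch, and that the learning rate accounts for the summed gradient being much larger than a single-item gradient.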


2 Responses to Deep Neural Network Training Batch vs. Online

  1. peterboos says:

    Hm, well, I have noticed something too about your neural networks.
    It seems that the initial seeding with random values can have a great effect.
    E.g. a 3:5:3 network, keeping all data equal, always produces the same result.
    But I modified MakeTrainTest to allow me to set a starting seed value (e.g. 100).
    I can also set the ratio between training data and validation data.
    My function call looks like: MakeTrainTest(allData, out trainData, out testData, 0.8, 100);
    The effect of that is almost as huge as using a slightly different network, e.g. 3:6:3.

    inside MakeTrainTest :
    static void MakeTrainTest(double[][] allData, out double[][] trainData, out double[][] testData, double trainPct = 0.8, int seed = -1)
    {
        // Use a time-based seed unless the caller supplies a non-negative one.
        int seeder = (seed < 0) ? (int)System.DateTime.Now.Ticks : seed;
        Random rnd = new Random(seeder);
        // ... rest of MakeTrainTest unchanged
    }
