Neural Network Train-Validate-Test

I recently monitored several online message threads about the train-validate-test process. I was rather surprised by the amount of incorrect information I saw. Allow me to explain . . .

The purpose of the train-validate-test technique is to identify when overfitting occurs during training, so you can quit training (“early stopping”). You take your source data, and randomly split the data into three sets: a training set (typically 60% of the items), a validation set (typically 20%), and a test set (the remaining 20%).
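A minimal sketch of one way to do the split, using NumPy (all_x and all_y are hypothetical names for the full dataset, and the 60-20-20 percentages are the typical ones just mentioned):

import numpy as np

# all_x, all_y hold the entire dataset as NumPy arrays (hypothetical names)
n = len(all_x)
idx = np.random.permutation(n)  # shuffle once before splitting
n_trn = int(0.60 * n)           # 60% for training
n_val = int(0.20 * n)           # 20% for validation; the rest is test
train_x, train_y = all_x[idx[:n_trn]], all_y[idx[:n_trn]]
val_x, val_y = all_x[idx[n_trn:n_trn+n_val]], all_y[idx[n_trn:n_trn+n_val]]
test_x, test_y = all_x[idx[n_trn+n_val:]], all_y[idx[n_trn+n_val:]]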

During training, you train using only the training data, but once every few epochs you evaluate the loss/error and accuracy of the current model on the validation data. Then, when training is finished, you evaluate the model's loss/error and accuracy on the test data.
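Sketched as a loop in Keras, the schedule looks something like this. This assumes the model was compiled with an accuracy metric and reuses the variable names from the split above; checking every 10 epochs is an arbitrary choice:

for epoch in range(max_epochs):
  # train on the training data only
  model.fit(train_x, train_y, batch_size=bat_size, epochs=1, verbose=0)
  if epoch % 10 == 0:
    # monitor, but never train on, the validation data
    val_loss, val_acc = model.evaluate(val_x, val_y, verbose=0)
    print("epoch %4d  val_loss %0.4f  val_acc %0.4f" % \
      (epoch, val_loss, val_acc))

# after training finishes, estimate generalization with the test data
test_loss, test_acc = model.evaluate(test_x, test_y, verbose=0)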

In theory, as you train, the loss/error on the training data will steadily decrease over time. The loss/error on the validation data will initially decrease, but at some point it will begin to increase as overfitting starts to occur. The loss/error and prediction accuracy on the test data are a rough estimate of what you'd see on new, previously unseen data.

This is all great in theory, but in practice it rarely works so neatly. The problem is that loss/error isn't nice and smooth like the idealized curves just described. For real data, the graphs jump around wildly, and it's very, very difficult to tell when overfitting is starting to kick in.
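One common way to cope with the jumpiness is a patience window: keep training until the validation loss has failed to improve for some number of consecutive epochs. A minimal sketch with the Keras EarlyStopping callback, assuming TensorFlow's bundled Keras; the patience value of 20 is just an illustration:

from tensorflow import keras

# stop only after validation loss has failed to improve for
# 'patience' consecutive epochs, then roll back to the best weights
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss',
  patience=20, restore_best_weights=True)

You'd pass early_stop to model.fit through its callbacks argument, together with validation_data pointing at the validation set (not the test set).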

I’ve seen all kinds of incorrect statements about how train-validate-test works. For example, I sometimes see Keras code like this:

h = model.fit(train_x, train_y, batch_size=bat_size, verbose=0,
  epochs=max_epochs, validation_data=(test_x, test_y),
  callbacks=[my_logger])

Sure, you can pass test data as the validation data, but that misses the whole point of the technique. And I frequently saw messages where people stated that validation is required. No, it's not. In fact, the train-validate-test technique is rarely used anymore, especially with deep neural networks.
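For the record, if you do use Keras's built-in validation monitoring, the call should pass the held-out validation set, using the val_x and val_y names from the split sketch above:

h = model.fit(train_x, train_y, batch_size=bat_size, verbose=0,
  epochs=max_epochs, validation_data=(val_x, val_y),
  callbacks=[my_logger])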

The moral of the story here is that you shouldn’t blindly believe everything you read on the Internet. And that includes my posts too!

