Machine Learning Ensemble Model Averaging

I heard a very good talk about machine learning at the 2016 SAS Analytics Conference. I forget the exact title, but the speaker was Brett Wujek from SAS.

There were many nice points in Brett’s talk. One of them described different forms of ensemble techniques. One of the most powerful ensemble techniques is model averaging: create two or more prediction models and then average their predictions.
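In code, model averaging is just the mean of the member models’ predictions. Below is a minimal Python sketch, assuming scikit-learn is available; the data, the two member models, and all parameter values are made up for illustration and are not from Brett’s talk.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

# made-up training data: y is roughly linear in x, plus noise
X_train = np.arange(20, dtype=float).reshape(-1, 1)
y_train = 0.5 * X_train.ravel() + np.random.default_rng(1).normal(0.0, 1.0, 20)

net = MLPRegressor(hidden_layer_sizes=(50,), max_iter=5000, random_state=0)
lin = LinearRegression()
net.fit(X_train, y_train)  # each member model is trained independently ...
lin.fit(X_train, y_train)

X_new = np.array([[4.0]])
# ... and the ensemble prediction is the simple average of their outputs
y_pred = (net.predict(X_new) + lin.predict(X_new)) / 2.0
print(y_pred)
```

scikit-learn also packages this pattern as sklearn.ensemble.VotingRegressor, which fits a list of models and averages their predictions.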

Take a look at the graph below. The red dots represent training data with known input and output values. For example, if the input is X = 17, then the actual output is Y = 5.

[Graph: red dots = training data; blue curve = over-fitting model; green line = under-fitting model; dotted yellow line = average of the two models]

The blue curved line represents what you might get if you use a neural network and the model over-fits (which is typical). Notice the blue curve is very accurate on the training data (it passes close to the red dots), but the model has in effect memorized that data and predicts poorly for new inputs, say, X = 4.

The straight green line represents what you might get if you use linear regression and the model under-fits (which is typical). For most training data points the prediction is OK, but not great.

But if you average the predictions of the two models, you get the dotted yellow line. Notice it’s a pretty good model: it fits the training data well overall, without the wild swings of the over-fitted curve.
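To reproduce the picture numerically, here is a hedged sketch with made-up data: a high-degree polynomial stands in for the over-fitting neural network, a degree-1 fit stands in for the under-fitting straight line, and the ensemble prediction at X = 4 is their average.

```python
import numpy as np

# made-up training data in the spirit of the red dots
rng = np.random.default_rng(0)
train_x = np.linspace(0.0, 20.0, 12)
train_y = 0.4 * train_x + rng.normal(0.0, 1.5, 12)

over = np.polyfit(train_x, train_y, 8)   # wiggly curve that hugs the dots
under = np.polyfit(train_x, train_y, 1)  # straight line through the dots

x = 4.0  # a new input where the over-fit model can go astray
p_over = np.polyval(over, x)
p_under = np.polyval(under, x)
p_avg = 0.5 * (p_over + p_under)         # model averaging
print(p_over, p_under, p_avg)
```

Wherever the polynomial oscillates, the averaged prediction is pulled back toward the straight line, which is the smoothing effect the dotted yellow line shows.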

Many, but not all, of my colleagues are very strong proponents of ensemble techniques for machine learning. The main downside to ensemble techniques is increased complexity: you have to create, tune, and maintain two or more models instead of just one.
