Combining Logistic Regression Models by Averaging Their Weights

Suppose you want to make a logistic regression binary classification model, for example to predict if a hospital patient is male or female based on variables such as age and hospitalization history. And suppose your training data file is very large and won’t fit entirely into memory. You have several options. You can create a streaming data loader, which is quite a bit of work. Or you can split the one large file of training data into smaller files, create separate prediction models and then create a meta-handler that combines the outputs of the models into a final prediction. This approach is perhaps the most common, but it’s messy because you have to maintain separate models.
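The meta-handler approach can be as simple as averaging the predicted probabilities of the separate models. A minimal sketch using scikit-learn, with synthetic data standing in for a large file that's been split into two chunks (the dataset, split, and function name are made up for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# synthetic stand-in for a large training file split into two chunks
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X1, y1 = X[:100], y[:100]   # first chunk
X2, y2 = X[100:], y[100:]   # second chunk

m1 = LogisticRegression().fit(X1, y1)
m2 = LogisticRegression().fit(X2, y2)

# meta-handler: average the two models' predicted probabilities,
# then pick the class with the larger averaged probability
def predict_ensemble(X_new):
    p = (m1.predict_proba(X_new) + m2.predict_proba(X_new)) / 2.0
    return p.argmax(axis=1)

preds = predict_ensemble(X)
```

The messy part is exactly what the averaging works around: every model must be stored, loaded, and scored at prediction time.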

A third technique, which to the best of my knowledge hasn’t been thoroughly explored, is to split the training data, create separate models, then combine the models by averaging the weights and biases of the models.

A few days ago I did a small experiment where I tried this idea. It did not work well, but the results weren't conclusive because I used a very small (200-item) dataset, and I trained the separate models using stochastic gradient descent (SGD), which produces a lot of run-to-run variability in the resulting model weights and biases.

Today I re-ran my experiment. I used the same small-ish dataset, but this time I trained the two separate models using L-BFGS optimization instead of SGD. I was mildly surprised that this approach appeared to work quite well.

As a baseline, I first trained a single model on all 200 items; it scored 75.00% accuracy on a set of held-out test data. Then I trained two models, one on the first 100 data items and a second on the remaining 100 items. I fetched the two models' weights and biases, averaged them, and created a third model using the averaged values. The combined model achieved 77.50% accuracy on the same test data, very slightly better than the baseline model trained on all 200 data items.
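A sketch of the experiment using scikit-learn, whose LogisticRegression class uses the L-BFGS solver by default. The synthetic dataset and 100/100 split are stand-ins; the 75%/77.5% figures above come from my real data and won't reproduce here:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# synthetic stand-in for the 200-item dataset
X, y = make_classification(n_samples=200, n_features=4, random_state=1)

# two separate models, each trained on half the data;
# scikit-learn's LogisticRegression uses L-BFGS by default
m1 = LogisticRegression(solver="lbfgs").fit(X[:100], y[:100])
m2 = LogisticRegression(solver="lbfgs").fit(X[100:], y[100:])

# third model: set its weights and bias to the averages by hand
combined = LogisticRegression(solver="lbfgs")
combined.classes_ = m1.classes_
combined.coef_ = (m1.coef_ + m2.coef_) / 2.0
combined.intercept_ = (m1.intercept_ + m2.intercept_) / 2.0

preds = combined.predict(X)
```

Note that assigning coef_, intercept_, and classes_ to an unfitted estimator works in current scikit-learn, but it leans on an implementation detail rather than a documented API.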

As usual with machine learning experiments, these results aren’t conclusive. But the results suggest that it might be possible to create a logistic regression model using a very large file of training data by splitting the file, training separate models, and then creating a single model that combines the weights and biases of the separate models.
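Scaled up, the same idea can be applied one chunk at a time, so the full training file never has to sit in memory at once. A rough sketch, assuming a CSV file whose last column holds the 0/1 class label (the function name, chunk size, and file layout are assumptions, not part of the experiment above):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def train_by_chunks(csv_path, chunk_size=100_000):
    coefs, intercepts, classes = [], [], None
    # pandas reads the file one chunk at a time, so memory use stays bounded
    for chunk in pd.read_csv(csv_path, chunksize=chunk_size):
        X = chunk.iloc[:, :-1].values   # all columns except the last
        y = chunk.iloc[:, -1].values    # last column is the 0/1 label
        m = LogisticRegression(solver="lbfgs").fit(X, y)
        coefs.append(m.coef_)
        intercepts.append(m.intercept_)
        classes = m.classes_
    # single final model built from the averaged weights and biases
    combined = LogisticRegression()
    combined.classes_ = classes
    combined.coef_ = np.mean(coefs, axis=0)
    combined.intercept_ = np.mean(intercepts, axis=0)
    return combined
```

Whether the averaged model holds up at that scale is exactly the open question; each chunk must also contain examples of both classes for the per-chunk fits to succeed.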


Combining machine learning models is one thing, but combining fashion models is another thing.

Left: Three models combined on the runway at an Italian fashion show. OK — works quite well.

Center: I’m going to stick my neck out here and say this approach to combining fashion models at a Paris show is a solid “no”.

Right: Can someone explain these combined models at a Milan fashion show to me? On second thought, no, please don’t. I don’t think I want to know.

This entry was posted in PyTorch.
