A standard regression problem is one where the goal is to predict a single numeric value, for example, predicting the median price of a house in a town. Problems where the goal is to predict two or more numeric values are relatively rare. For example, you might want to predict both the poverty rate and the median house price in a town.
Note: Somewhat unfortunately, there is no standard name for regression with multiple output values. The term “multiple regression” is sometimes used but that term more often means predicting a single output value using two or more input/predictor values. The term “multivariate regression” is sometimes used, but multivariate just means multiple variables and so it can refer to multiple inputs and/or multiple outputs.
I coded up a demo of a regression problem with two output values. I used the well-known Boston Area House Price Dataset. The dataset has 14 columns:
[0]  = crime rate / 100
[1]  = pct large lots / 100
[2]  = pct business / 100
[3]  = adj to river (-1 = no, +1 = yes)
[4]  = pollution / 1
[5]  = avg num rooms / 10
[6]  = pct built before 1940 / 100
[7]  = distance to Boston / 100
[8]  = access to highways / 100
[9]  = tax rate / 1000
[10] = pupil-teacher ratio / 100
[11] = density Blacks / 1000
[12] = pct low socio-economic / 100
[13] = median house price / 100_000
Each item represents one of 506 towns near Boston. I normalized the raw data by dividing each column by a constant so that all values (except for the Boolean [3]) are between 0.0 and 1.0. I split the data into a 400-item training set and a 106-item test set.
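As a side note, the normalization can be done with a short NumPy script. Here is a minimal sketch, where the raw and normalized file names are hypothetical and the divisor constants correspond to the column list above:

import numpy as np

# hypothetical file names; divisors match the column list above
raw = np.loadtxt("boston_raw.txt", delimiter="\t",
  dtype=np.float32)
divisors = np.array([100, 100, 100, 1, 1, 10, 100, 100,
  100, 1000, 100, 1000, 100, 100_000], dtype=np.float32)
norm = raw / divisors  # column-wise division
norm[:,3] = np.where(raw[:,3] == 0, -1.0, 1.0)  # river col to -1/+1
np.savetxt("boston_all.txt", norm, fmt="%0.6f", delimiter="\t")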
The usual goal is to predict the median house price in a town [13] from the other variables. For my multiple-output demo I decided to predict both poverty [12] and price [13] from the other 12 variables.
First, I implemented a program-defined Dataset class to serve up items with 12 predictor values and 2 target values. That was not trivial (see the complete code below).
Next, I designed a 12-(10-10)-2 neural network. I didn’t apply any activation to the output nodes. But because all normalized poverty and price values are between 0 and 1, I could have used sigmoid() or relu() activation; I didn’t experiment with those options.
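If I had tried the sigmoid() option, the only change would be in the forward() method of the network class. A minimal sketch of what that alternative would look like (the demo code below does not use this):

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = T.tanh(self.hid2(z))
    z = T.sigmoid(self.oupt(z))  # squash both outputs to (0, 1)
    return z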
For training, I used L1Loss(), which is mean absolute error (MAE). I could have used MSELoss(), but MSE severely penalizes outliers because of the squaring operation, and so L1Loss() seemed like a better choice. I didn’t try MSELoss().
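A quick way to see the effect of the squaring operation is to compare the two loss functions on a set of predictions that contains one bad miss. A tiny sketch with made-up values:

import torch as T

pred   = T.tensor([0.20, 0.20, 0.20, 0.90])  # last value is way off
target = T.tensor([0.25, 0.15, 0.22, 0.20])
print(T.nn.L1Loss()(pred, target))   # mean abs error, about 0.205
print(T.nn.MSELoss()(pred, target))  # mean sq error, about 0.124

With L1Loss() the single outlier accounts for roughly 85% of the total loss; with MSELoss() it accounts for roughly 99%.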
I implemented a program-defined accuracy() function. For a prediction to be correct, I specified that both the predicted poverty and the predicted price must be within 20% of the true target poverty and price. For example, if a target poverty value is 0.30, the predicted poverty must be between 0.24 and 0.36.
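The demo computes accuracy one item at a time, which is slow but convenient when you want to examine exactly which items are scored incorrectly. A vectorized all-at-once version is also possible; a minimal sketch, assuming direct access to the x_data and y_data fields of the Dataset class defined in the demo code below:

def accuracy_quick(model, ds, pct_close):
  # all items at once -- faster, but no per-item detail
  X = ds.x_data  # all inputs, shape [n, 12]
  Y = ds.y_data  # all targets, shape [n, 2]
  with T.no_grad():
    oupt = model(X)  # predicted (poverty, price), shape [n, 2]
  close = T.abs(oupt - Y) < T.abs(pct_close * Y)  # [n, 2] bool
  n_correct = T.sum(close[:,0] & close[:,1]).item()
  return n_correct / len(ds)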
A fun and interesting project.
When I was a college student at UC Irvine in California, I worked at Disneyland in Anaheim. One of the rides I worked on was the Haunted Mansion. The main job was to make sure people didn’t trip when getting into the “Doom Buggy” in the dimly lit entrance area. The Mansion has some cool effects, including Madame Leota in a crystal ball. The Mansion cost $7 million to construct over several years in the 1960s; there were long delays, due in part to Walt Disney’s death in 1966.
Demo code. The training and test data can be found at https://jamesmccaffrey.wordpress.com/2022/05/09/the-boston-area-house-price-problem-using-pytorch-1-10-on-windows-10-11/.
# boston_dual_regression.py
# Boston Area House Dataset
# predict price and poverty rate
# PyTorch 1.12.1-CPU Anaconda3-2020.02  Python 3.7.6
# Windows 10/11

import numpy as np
import torch as T
device = T.device('cpu')

# -----------------------------------------------------------

class BostonDataset(T.utils.data.Dataset):
  # features in cols [0,11], poverty in [12], price in [13]

  def __init__(self, src_file):
    all_xy = np.loadtxt(src_file, usecols=range(0,14),
      delimiter="\t", comments="#", dtype=np.float32)
    self.x_data = T.tensor(all_xy[:,0:12]).to(device)
    self.y_data = \
      T.tensor(all_xy[:,[12,13]]).to(device)

  def __len__(self):
    return len(self.x_data)

  def __getitem__(self, idx):
    preds = self.x_data[idx]
    targets = self.y_data[idx]  # (poverty, price)
    return (preds, targets)  # as a tuple

# -----------------------------------------------------------

class Net(T.nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(12, 10)  # 12-(10-10)-2
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 2)

    T.nn.init.xavier_uniform_(self.hid1.weight)
    T.nn.init.zeros_(self.hid1.bias)
    T.nn.init.xavier_uniform_(self.hid2.weight)
    T.nn.init.zeros_(self.hid2.bias)
    T.nn.init.xavier_uniform_(self.oupt.weight)
    T.nn.init.zeros_(self.oupt.bias)

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = T.tanh(self.hid2(z))
    z = self.oupt(z)  # no activation, aka Identity()
    return z

# -----------------------------------------------------------

def train(model, ds, bs, lr, me, le):
  # dataset, bat_size, lrn_rate, max_epochs, log interval
  train_ldr = T.utils.data.DataLoader(ds, batch_size=bs,
    shuffle=True)
  loss_func = T.nn.L1Loss()  # mean absolute error
  optimizer = T.optim.Adam(model.parameters(), lr=lr)

  for epoch in range(0, me):
    epoch_loss = 0.0  # for one full epoch
    for (b_idx, batch) in enumerate(train_ldr):
      X = batch[0]  # predictors
      y = batch[1]  # target (poverty, price)
      optimizer.zero_grad()
      oupt = model(X)
      loss_val = loss_func(oupt, y)  # a tensor
      epoch_loss += loss_val.item()  # accumulate
      loss_val.backward()  # compute gradients
      optimizer.step()     # update weights

    if epoch % le == 0:
      print("epoch = %4d  |  loss = %0.4f" % \
        (epoch, epoch_loss))

# -----------------------------------------------------------

def accuracy(model, ds, pct_close):
  # one-by-one (good for analysis)
  # assumes model.eval()
  n_correct = 0; n_wrong = 0
  for i in range(len(ds)):
    (X, Y) = ds[i]  # Y = target (poverty, price)
    with T.no_grad():
      oupt = model(X)  # predicted (poverty, price)

    if T.abs(oupt[0] - Y[0]) < T.abs(pct_close * Y[0]) and \
       T.abs(oupt[1] - Y[1]) < T.abs(pct_close * Y[1]):
      n_correct += 1
    else:
      n_wrong += 1
  return (n_correct * 1.0) / (n_correct + n_wrong)

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nBoston dual output regression using PyTorch ")
  np.random.seed(0)
  T.manual_seed(0)

  # 1. create Dataset objects
  print("\nLoading Boston train and test Datasets ")
  train_file = ".\\Data\\boston_train.txt"
  train_ds = BostonDataset(train_file)
  test_file = ".\\Data\\boston_test.txt"
  test_ds = BostonDataset(test_file)

  # 2. create model
  print("\nCreating 12-(10-10)-2 regression network ")
  net = Net().to(device)
  net.train()

  # 3. train model
  print("\nbatch size = 10 ")
  print("loss = L1Loss() ")
  print("optimizer = Adam ")
  print("learn rate = 0.005 ")
  print("max epochs = 5000 ")

  print("\nStarting training ")
  train(net, train_ds, bs=10, lr=0.005, me=5000, le=1000)
  print("Done ")

  # 4. evaluate model accuracy
  net.eval()
  acc_train = accuracy(net, train_ds, 0.20)
  print("\nAccuracy on train (within 0.20) = %0.4f " % acc_train)
  acc_test = accuracy(net, test_ds, 0.20)
  print("Accuracy on test (within 0.20) = %0.4f " % acc_test)

  # 5. TODO: save model

  # 6. use model
  print("\nPredicting normalized (poverty, price) first train")
  print("Actual (poverty, price) = (0.0914, 0.2160) ")
  x = np.array([0.000273, 0.000, 0.0707, -1, 0.469,
    0.6421, 0.789, 0.049671, 0.02, 0.242, 0.178,
    0.39690], dtype=np.float32)
  x = T.tensor(x, dtype=T.float32)

  with T.no_grad():
    oupt = net(x)
  print("Predicted poverty price = %0.4f %0.4f " % \
    (oupt[0], oupt[1]))

  print("\nEnd demo ")

if __name__=="__main__":
  main()