I’ve been exploring the idea of training a PyTorch neural network using an evolutionary algorithm. The basic idea is to create a population of solutions (here, a set of neural weights and biases) and then repeatedly combine two solutions to create a better child solution.

Conceptually, the ideas are quite subtle. The two main points of using PyTorch are 1.) use GPU tensors for speed, and 2.) use built-in gradient computation for back-propagation training. An evolutionary algorithm doesn’t use gradients, but it still makes sense to use PyTorch for an evolutionary approach because I can leverage the enormous PyTorch infrastructure, which includes DataLoader objects and so on. For my demo, I created a 6-(10-10)-3 neural network where I imagine the goal is to predict Employee job-type (one of three) from sex, age, city (one of three), and income. The six inputs are one value for sex, one for age, three for the one-hot encoded city, and one for income.

```
import torch as T

class Net(T.nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(6, 10)  # 6-(10-10)-3
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 3)

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = T.tanh(self.hid2(z))
    z = T.softmax(self.oupt(z), dim=1) # note
    return z
```

The hid1 layer weights are size [10,6] and the biases are size [10]. Similarly, hid2 weights and biases are size [10,10] and [10], and the oupt weights and biases are size [3,10] and [3]. Therefore, there are a total of 60 + 10 + 100 + 10 + 30 + 3 = 213 weights and biases.
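The count can be double-checked by asking PyTorch for the number of elements in each parameter tensor. This is a minimal sketch, not part of the demo program. It repeats only the layer definitions (no forward() method is needed just to count), and the names `sizes` and `total` are mine.

```python
import torch as T

class Net(T.nn.Module):
  def __init__(self):
    super().__init__()
    self.hid1 = T.nn.Linear(6, 10)  # 6-(10-10)-3
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 3)

net = Net()

# map each parameter name to its number of elements
sizes = {name: p.numel() for (name, p) in net.named_parameters()}
for (name, p) in net.named_parameters():
  print(name, tuple(p.shape), p.numel())

total = sum(sizes.values())
print("total weights and biases =", total)
```

The parameters are listed in the order hid1 weights, hid1 biases, hid2 weights, hid2 biases, oupt weights, oupt biases.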

I wrote a function that accepts a tensor of 213 values and then distributes them to the network weights and biases:

```
def load_weights(net, wts):
  if len(wts) != (10*6) + 10 + (10*10) + 10 + (3*10) + 3:
    print("FATAL: incorrect number wts in load_weights() ")

  net.hid1.weight.data = wts[0:60].reshape((10,6))
  net.hid1.bias.data = wts[60:70]
  net.hid2.weight.data = wts[70:170].reshape((10,10))
  net.hid2.bias.data = wts[170:180]
  net.oupt.weight.data = wts[180:210].reshape((3,10))
  net.oupt.bias.data = wts[210:213]
```

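Important: The values are assigned to the weights and biases by reference rather than by value. Each slice of wts is a view that shares memory with the source tensor, so a later change to wts also changes the network. Because of this, the technique probably can’t be used (I’m not sure) for a standard PyTorch scenario where gradients are used during training.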
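The by-reference behavior is easy to see on a single layer. The sketch below is an illustration rather than part of the demo: it uses a stand-alone Linear(10, 3) layer in place of the oupt layer, and the copy_() alternative inside no_grad() is one possible way to copy by value, not something the demo program does.

```python
import torch as T

# by reference: assigning through .data makes the layer share memory with wts
lin = T.nn.Linear(10, 3)  # stand-in for the oupt layer
wts = T.arange(33, dtype=T.float32) / 100.0
lin.weight.data = wts[0:30].reshape((3,10))
lin.bias.data = wts[30:33]
wts[30] = 99.0                       # change the source tensor afterwards
shared = lin.bias[0].item()          # the layer sees the change
print("by reference:", shared)

# by value: copy_() writes into the layer's own existing storage
lin2 = T.nn.Linear(10, 3)
wts2 = T.arange(33, dtype=T.float32) / 100.0
with T.no_grad():
  lin2.weight.copy_(wts2[0:30].reshape((3,10)))
  lin2.bias.copy_(wts2[30:33])
wts2[30] = 99.0                      # same change to the source tensor
independent = lin2.bias[0].item()    # the layer keeps its copied value
print("by value:", independent)
```

For an evolutionary algorithm the sharing matters whenever a solution tensor is modified in place after it has been loaded, for example during mutation.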

I created a short program to create a network and load weights with values from 0.00 to 2.12 (the integers 0 through 212 divided by 100). The demo seemed to work. Now I have the infrastructure I need to explore training the network using an evolutionary algorithm.
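To make the "combine two solutions to create a child" idea concrete, here is a rough sketch of what one step might look like when each solution is a flat tensor of 213 values. Everything in it is an assumption for illustration: the helper name `make_child`, the one-point crossover, the Gaussian mutation, and the rate and scale values are not taken from the demo.

```python
import torch as T

N_WTS = 213  # weights and biases in the 6-(10-10)-3 network

def make_child(parent1, parent2, mutate_rate=0.05, mutate_scale=0.10):
  # hypothetical helper: one-point crossover followed by random mutation
  cut = int(T.randint(1, N_WTS, (1,)).item())  # crossover point in [1, 212]
  child = T.cat((parent1[0:cut], parent2[cut:]))  # new tensor, no sharing
  # add Gaussian noise to a random subset of the values
  mask = (T.rand(N_WTS) < mutate_rate).float()
  child = child + mask * mutate_scale * T.randn(N_WTS)
  return child

T.manual_seed(1)
# a small population of random solutions with values in [-0.1, +0.1)
pop = [T.rand(N_WTS) * 0.2 - 0.1 for _ in range(6)]
child = make_child(pop[0], pop[1])
print(child.shape)
```

A child produced this way has the right length to be passed to load_weights(), after which the network can be scored on the training data to decide whether the child replaces a weaker member of the population.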

The demo program is a multi-class classification problem. In general, regression problems, where the goal is to predict a single numeric value, such as an SAT (Scholastic Aptitude Test) college readiness test score, tend to be more difficult than classification problems. I’m hopeful that evolutionary algorithms can improve accuracy on regression problems. SAT math scores are quite stable over time. The ability gap between groups has been consistent and large. Predicting SAT scores is not difficult. Note: the big jump in scores in 2017 was due to a change in the test, not a jump in ability.

Demo code. The data can be found at https://jamesmccaffrey.wordpress.com/2022/04/29/predicting-employee-job-type-using-pytorch-1-10-on-windows-11/.

```
# employee_job_load_wts.py
# predict job type from sex, age, city, income
# PyTorch 1.12.1-CPU Anaconda3-2020.02  Python 3.7.6
# Windows 10/11

import numpy as np
import torch as T
device = T.device('cpu')  # apply to Tensor or Module

# -----------------------------------------------------------

class EmployeeDataset(T.utils.data.Dataset):
  # sex  age    city      income  job-type
  # -1   0.27   0  1  0   0.7610   2
  # +1   0.19   0  0  1   0.6550   0
  # sex: -1 = male, +1 = female
  # city: anaheim, boulder, concord
  # job type: mgmt, supp, tech

  def __init__(self, src_file):
    # assumption: the line that reads the file is missing here;
    # np.loadtxt over whitespace-delimited columns is one way to do it
    all_xy = np.loadtxt(src_file, usecols=range(0,7),
      comments="#", dtype=np.float32)

    tmp_x = all_xy[:,0:6]  # cols [0,6) = [0,5]
    tmp_y = all_xy[:,6]    # 1-D

    self.x_data = T.tensor(tmp_x,
      dtype=T.float32).to(device)
    self.y_data = T.tensor(tmp_y,
      dtype=T.int64).to(device)  # 1-D

  def __len__(self):
    return len(self.x_data)

  def __getitem__(self, idx):
    preds = self.x_data[idx]
    trgts = self.y_data[idx]
    return (preds, trgts)  # as a Tuple

# -----------------------------------------------------------

class Net(T.nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(6, 10)  # 6-(10-10)-3
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 3)

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = T.tanh(self.hid2(z))
    z = T.softmax(self.oupt(z), dim=1) # note
    return z

# -----------------------------------------------------------

def accuracy_quick(model, dataset):
  # assumes model.eval()
  # dataset[0:n] returns a (predictors, targets) tuple
  X = dataset[0:len(dataset)][0]
  Y = T.flatten(dataset[0:len(dataset)][1])

  oupt = model(X)
  # (_, arg_maxs) = T.max(oupt, dim=1)
  arg_maxs = T.argmax(oupt, dim=1)  # argmax() is new
  num_correct = T.sum(Y==arg_maxs)
  acc = (num_correct * 1.0 / len(dataset))
  return acc.item()

# -----------------------------------------------------------

def load_weights(net, wts):
  if len(wts) != (10*6) + 10 + (10*10) + 10 + (3*10) + 3:
    print("FATAL: incorrect number wts in load_weights() ")

  net.hid1.weight.data = wts[0:60].reshape((10,6))
  net.hid1.bias.data = wts[60:70]
  net.hid2.weight.data = wts[70:170].reshape((10,10))
  net.hid2.bias.data = wts[170:180]
  net.oupt.weight.data = wts[180:210].reshape((3,10))
  net.oupt.bias.data = wts[210:213]

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nBegin PyTorch load weights demo ")
  T.manual_seed(1)
  np.random.seed(1)

  # 1. create Dataset objects
  print("\nCreating Employee Datasets ")

  train_file = ".\\Data\\employee_train.txt"
  train_ds = EmployeeDataset(train_file)  # 200 rows

  test_file = ".\\Data\\employee_test.txt"
  test_ds = EmployeeDataset(test_file)  # 40 rows

# -----------------------------------------------------------

  # 2. create network
  print("\nCreating 6-(10-10)-3 NN default init ")
  net = Net().to(device)
  net.eval()

  X = np.array([[-1, 0.30,  0,0,1,  0.5000]],
    dtype=np.float32)
  X = T.tensor(X, dtype=T.float32).to(device)

  print("\nInput = ")
  print(X)
  z = net(X)
  print("\nOutput = ")
  print(z)

  # 3. load weights and biases
  wts = T.arange(start=0, end=213, step=1,
    dtype=T.float32).to(device)
  wts /= 100.0  # values 0.00, 0.01, . . . 2.12
  print("\nSetting 213 weight/bias values: ")
  print(wts[0:6], end=""); print(" . . . ")
  load_weights(net, wts)  # without this call the output would not change

  print("\nInput = ")
  print(X)
  z = net(X)
  print("\nOutput = ")
  print(z)

# -----------------------------------------------------------

  # 4. evaluate model accuracy
  print("\nComputing model accuracy")
  acc_train = accuracy_quick(net, train_ds)
  print("Accuracy on train data = %0.4f" % acc_train)
  acc_test = accuracy_quick(net, test_ds)
  print("Accuracy on test data = %0.4f" % acc_test)