## Parameterizing PyTorch Neural Network Architecture and Training Values for Evolutionary Optimization

When creating a neural network prediction model, you have to set values for the architecture (number of hidden layers, number of hidden nodes in each layer, hidden activation, etc.) and for training (optimizer, batch size, etc.). In some scenarios you can manually experiment with these hyperparameter values. In other scenarios, you can set up lists of possible values and then use random search or grid search.
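A minimal sketch of the list-based approach, using only the standard library. The candidate lists are hypothetical, and in a real run each generated dictionary would be passed to a function that trains and scores a network.

```python
import itertools
import random

# Hypothetical candidate lists; real values depend on the problem.
search_space = {
    "n_hid": [4, 8, 10, 16],
    "activ": ["tanh", "relu"],
    "bs": [5, 10, 20],
    "lr": [0.001, 0.01, 0.1],
}

def grid_search(space):
    # Every combination of the listed values, one dict per combination.
    keys = list(space)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*(space[k] for k in keys))]

def random_search(space, n_trials, rng):
    # n_trials independent draws, one value per hyperparameter.
    return [{k: rng.choice(v) for k, v in space.items()}
            for _ in range(n_trials)]

grid = grid_search(search_space)
samples = random_search(search_space, n_trials=5, rng=random.Random(0))
print(len(grid))   # 4 * 2 * 3 * 3 = 72 combinations
print(samples[0])
```

Grid search cost grows multiplicatively with each added list, which is one reason random search is often preferred when there are many hyperparameters.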

A more sophisticated approach is to use evolutionary optimization to find a good set of architecture and training values. This is a project I’ve been looking at recently. As part of my experiments, I put together a demo that parameterizes a network and training values and then computes a fitness value. The idea is best explained by code.

Suppose you want to predict the political leaning of a person (conservative = 0, moderate = 1, liberal = 2) from their sex (male = -1, female = +1), age (divided by 100), state (Michigan = 100, Nebraska = 010, Oklahoma = 001), and income (divided by \$100,000). Now, consider this code:

```
# first, create train_ds and test_ds
print("Setting 6-(10-10)-3 tanh 10 0.01 1000 SGD")
f = fitness(n_hid=10, activ='tanh',
  trn_ds=train_ds, tst_ds=test_ds,
  bs=10, lr=0.01, me=1000, opt='sgd')
print("Fitness = %0.4f " % f)
```

The fitness function creates a 6-(10-10)-3 neural network classifier with tanh() hidden node activation, and trains it using a batch size of 10, stochastic gradient descent with a learning rate of 0.01, and 1000 epochs. The return value is a measure of how good the network is, often called a fitness value in evolutionary optimization terminology.
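The six input values come from the encoding described earlier. As a small illustration, the sketch below encodes one raw record; the helper name `encode_person()` and the lookup tables are hypothetical and not part of the demo, which reads data that is already encoded.

```python
# One-hot codes for the three states, as described in the text.
STATE_CODES = {
    "michigan": [1, 0, 0],
    "nebraska": [0, 1, 0],
    "oklahoma": [0, 0, 1],
}
SEX_CODES = {"male": -1, "female": 1}
POLITICS_CODES = {"conservative": 0, "moderate": 1, "liberal": 2}

def encode_person(sex, age, state, income):
    # Returns the six predictor values: sex, age, state (3), income.
    return ([SEX_CODES[sex], age / 100]
            + STATE_CODES[state]
            + [income / 100_000])

# A 27-year-old male from Nebraska with income $76,100, liberal.
x = encode_person("male", 27, "nebraska", 76_100)
y = POLITICS_CODES["liberal"]
print(x, y)
```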

The fitness() function is very short and simple because it farms out most of the work to the program-defined train() and accuracy_quick() functions:

```
def fitness(n_hid=10, activ='tanh', trn_ds=None, tst_ds=None,
  bs=10, lr=0.01, me=1000, opt='sgd'):

  T.manual_seed(1)  # prepare
  np.random.seed(1)

  net = Net(n_hid, activ).to(device)  # create

  net.train()
  train(net, trn_ds, bs, lr, me, opt)  # train

  net.eval()
  acc_train = accuracy_quick(net, trn_ds)  # evaluate
  acc_test = accuracy_quick(net, tst_ds)
  return (acc_train + acc_test) / 2
```

I decided to define fitness as the average of the accuracy of the trained network on the training and test data. This is something I need to give more thought to.
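One possible variation, shown only as a sketch: a weighted mean that lets you shift emphasis between the two accuracies. The function name `combined_fitness()` and the `w_test` parameter are hypothetical; with `w_test=0.5` the result is the same plain average that fitness() returns.

```python
def combined_fitness(acc_train, acc_test, w_test=0.5):
    # Weighted mean of the two accuracies; w_test=0.5 reproduces
    # the plain average used by fitness().
    if not 0.0 <= w_test <= 1.0:
        raise ValueError("w_test must be in [0, 1]")
    return (1.0 - w_test) * acc_train + w_test * acc_test

plain = combined_fitness(0.90, 0.70)                # plain average
test_heavy = combined_fitness(0.90, 0.70, w_test=0.8)  # favors test accuracy
print(plain, test_heavy)
```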

I don’t believe it’s feasible to create a general-purpose framework for parameterization because each problem is significantly different. The real decisions are what to parameterize and what to hard-code. For example, my demo hard-codes the architecture with a fixed two hidden layers rather than a variable number of layers.
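To make the trade-off concrete, here is a sketch of what parameterizing the number of hidden layers could involve. The helper `layer_shapes()` is hypothetical and only computes the (inputs, outputs) pair for each linear layer; a network class would still have to build its layers from that list.

```python
def layer_shapes(n_in, n_out, n_hid, n_layers):
    # (inputs, outputs) for each linear layer in a network with
    # n_layers hidden layers of n_hid nodes each.
    if n_layers < 1:
        raise ValueError("need at least one hidden layer")
    sizes = [n_in] + [n_hid] * n_layers + [n_out]
    return list(zip(sizes[:-1], sizes[1:]))

# The demo's fixed 6-(10-10)-3 architecture as a special case.
print(layer_shapes(6, 3, n_hid=10, n_layers=2))
# [(6, 10), (10, 10), (10, 3)]
```

Each extra parameter like `n_layers` enlarges the search space, so hard-coding it is a reasonable choice when two hidden layers are known to work well enough.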

The parameterization is just the first part of an evolutionary optimization system. My next steps will be to add functions to generate random solutions, select two parent solutions, combine two parents to produce a child solution, and mutate child solutions.
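A rough sketch of those four operations, assuming a solution is a dictionary of fitness() arguments. Everything here is hypothetical: the candidate lists, tournament selection, uniform crossover, and the mutation rate are common choices picked for illustration, and the fitness values are random placeholders rather than real accuracies.

```python
import random

# Hypothetical search space; keys are a subset of the fitness() parameters.
SPACE = {
    "n_hid": [4, 6, 8, 10, 12],
    "activ": ["tanh", "relu"],
    "bs": [5, 10, 20],
    "lr": [0.001, 0.01, 0.05],
    "opt": ["sgd", "adam"],
}

def random_solution(rng):
    # One random value per hyperparameter.
    return {k: rng.choice(v) for k, v in SPACE.items()}

def select_parents(population, fitnesses, rng, k=3):
    # Tournament selection: best of k random candidates, done twice.
    def pick():
        idxs = rng.sample(range(len(population)), k)
        return population[max(idxs, key=lambda i: fitnesses[i])]
    return pick(), pick()

def crossover(p1, p2, rng):
    # Uniform crossover: each gene comes from one parent at random.
    return {k: (p1[k] if rng.random() < 0.5 else p2[k]) for k in SPACE}

def mutate(child, rng, rate=0.2):
    # Each gene is re-drawn from its list with probability `rate`.
    return {k: (rng.choice(SPACE[k]) if rng.random() < rate else v)
            for k, v in child.items()}

rng = random.Random(1)
pop = [random_solution(rng) for _ in range(6)]
fake_fit = [rng.random() for _ in pop]  # placeholder, not real accuracies
pa, pb = select_parents(pop, fake_fit, rng)
child = mutate(crossover(pa, pb, rng), rng)
print(child)
```

In a full system the placeholder list would be replaced by calls such as `fitness(trn_ds=train_ds, tst_ds=test_ds, me=1000, **solution)`, which is where nearly all of the run time would go.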

Fascinating stuff (to me anyway).

Evolution has produced some strange animals. Left: Tullimonstrum, informally known as the Tully monster, is an extinct invertebrate that lived about 300 million years ago. It was about 14 inches long and had two primitive eye stalks. Right: Opabinia is an extinct arthropod that lived about 500 million years ago. It was about three inches long and had five eyes. Images like these in my head are one of several reasons why I don’t eat calamari.

Demo code below. The training and test data can be found at https://jamesmccaffrey.wordpress.com/2022/09/01/multi-class-classification-using-pytorch-1-12-1-on-windows-10-11/.

```
# people_politics_encoded.py
# predict politics type from sex, age, state, income
# PyTorch 2.0.1-CPU Anaconda3-2022.10  Python 3.9.13
# Windows 10/11

# experiment for hyperparameter evolutionary optimization

import numpy as np
import torch as T
device = T.device('cpu')  # apply to Tensor or Module

# -----------------------------------------------------------

class PeopleDataset(T.utils.data.Dataset):
  # sex  age    state    income   politics
  # -1   0.27   0  1  0   0.7610   2
  # +1   0.19   0  0  1   0.6550   0
  # sex: -1 = male, +1 = female
  # politics: conservative, moderate, liberal

  def __init__(self, src_file):
    # assumes a tab-delimited text file with 7 numeric columns
    all_xy = np.loadtxt(src_file, usecols=range(0,7),
      delimiter="\t", comments="#", dtype=np.float32)
    tmp_x = all_xy[:,0:6]   # cols [0,6) = [0,5]
    tmp_y = all_xy[:,6]     # 1-D

    self.x_data = T.tensor(tmp_x,
      dtype=T.float32).to(device)
    self.y_data = T.tensor(tmp_y,
      dtype=T.int64).to(device)  # 1-D

  def __len__(self):
    return len(self.x_data)

  def __getitem__(self, idx):
    preds = self.x_data[idx]
    trgts = self.y_data[idx]
    return preds, trgts  # as a Tuple

# -----------------------------------------------------------

class Net(T.nn.Module):
  def __init__(self, n_hid, activ='tanh'):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(6, n_hid)  # 6-(nh-nh)-3
    self.hid2 = T.nn.Linear(n_hid, n_hid)
    self.oupt = T.nn.Linear(n_hid, 3)

    if activ == 'tanh':
      self.activ = T.nn.Tanh()
    elif activ == 'relu':
      self.activ = T.nn.ReLU()
    else:
      raise ValueError("unknown activation: " + str(activ))

    # use default weight init

  def forward(self, x):
    z = self.activ(self.hid1(x))
    z = self.activ(self.hid2(z))
    z = T.log_softmax(self.oupt(z), dim=1)  # NLLLoss()
    return z

# -----------------------------------------------------------

def accuracy_quick(model, dataset):
  # assumes model.eval()
  X = dataset[0:len(dataset)][0]  # all predictors
  Y = dataset[0:len(dataset)][1]  # all target labels
  with T.no_grad():  # no gradients needed for evaluation
    oupt = model(X)  # [n,3] log-probabilities
  arg_maxs = T.argmax(oupt, dim=1)  # predicted class per row
  num_correct = T.sum(Y==arg_maxs)
  acc = (num_correct * 1.0 / len(dataset))
  return acc.item()

# -----------------------------------------------------------

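def train(net, ds, bs, lr, me, opt='sgd'):
  # dataset, bat_size, lrn_rate, max_epochs, optimizer
  train_ldr = T.utils.data.DataLoader(ds, batch_size=bs,
    shuffle=True)
  loss_func = T.nn.NLLLoss()
  if opt == 'sgd':
    optimizer = T.optim.SGD(net.parameters(), lr=lr)
  elif opt == 'adam':
    optimizer = T.optim.Adam(net.parameters(), lr=lr)
  else:
    raise ValueError("unknown optimizer: " + str(opt))
  le = max(1, me // 10)  # log every le epochs (interval is an assumed value)

  print("\nStarting training ")
  for epoch in range(0, me):
    epoch_loss = 0.0  # for one full epoch
    for (batch_idx, batch) in enumerate(train_ldr):
      X = batch[0]  # inputs
      Y = batch[1]  # correct class/label/politics

      optimizer.zero_grad()  # clear gradients from the previous batch
      oupt = net(X)
      loss_val = loss_func(oupt, Y)  # a tensor
      epoch_loss += loss_val.item()  # accumulate
      loss_val.backward()
      optimizer.step()

    if epoch % le == 0:
      print("epoch = %5d  |  loss = %10.4f" % \
        (epoch, epoch_loss))
  print("Done ")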

# -----------------------------------------------------------

def fitness(n_hid=10, activ='tanh', trn_ds=None, tst_ds=None,
  bs=10, lr=0.01, me=1000, opt='sgd'):

  T.manual_seed(1)  # prepare
  np.random.seed(1)

  net = Net(n_hid, activ).to(device)  # create

  net.train()
  train(net, trn_ds, bs, lr, me, opt)  # train

  net.eval()
  acc_train = accuracy_quick(net, trn_ds)  # evaluate
  acc_test = accuracy_quick(net, tst_ds)
  return (acc_train + acc_test) / 2

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nBegin People predict politics type ")

  # 1. create Dataset objects
  print("\nCreating People Datasets ")

  train_file = ".\\Data\\people_train.txt"
  train_ds = PeopleDataset(train_file)  # 200 rows

  test_file = ".\\Data\\people_test.txt"
  test_ds = PeopleDataset(test_file)    # 40 rows

  # 2. compute fitness for architecture and train parameters
  print("\nSetting 6-(10-10)-3 tanh 10 0.01 1000 SGD")
  f = fitness(n_hid=10, activ='tanh',
    trn_ds=train_ds, tst_ds=test_ds,
    bs=10, lr=0.01, me=1000, opt='sgd')
  print("\nFitness = %0.4f " % f)

  print("\nSetting 6-(8-8)-3 relu 10 0.01 1000 Adam")
  f = fitness(n_hid=8, activ='relu',
    trn_ds=train_ds, tst_ds=test_ds,