## The Curse of Machine Learning

Most of the machine learning guys I work with suffer from the same curse that I do. We love what we do. The curse is that we're always thinking about our algorithms, systems, and code, even when we're (literally) sleeping.

It's not uncommon for me to write some code and then, several days or weeks later, realize I wasn't quite happy with it. When this happens, I can't rest easy until I refactor the code in question.

This happened recently when I wrote some code for neural network hyperparameter tuning using evolutionary optimization. See jamesmccaffrey.wordpress.com/2022/11/03/hyperparameter-tuning-using-evolutionary-optimization/.

I woke up one morning with an idea to refactor the code. My original version used a nested class definition. My thought was to ditch the nested class and use an array of Tuple items. I knew I’d never be satisfied until I tried my idea. And so I did.
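The refactoring idea can be sketched in a few lines. This is a minimal illustration, not the demo code itself (the class name `Individual` here is hypothetical, standing in for my original nested class): instead of wrapping each solution and its error in a class, each population member is just a `(solution, error)` tuple, and `sorted()` with a key function orders the population by error.

```python
import numpy as np

rnd = np.random.RandomState(1)

# original approach, sketched: a class holding a solution and its error
class Individual:
  def __init__(self, soln, err):
    self.soln = soln
    self.err = err

# refactored approach: no class, just (np array, float) tuples
pop = []
for _ in range(6):
  soln = rnd.randint(0, 8, size=10)   # 10 values, each in [0,7]
  err = float(np.sum(soln))           # artificial error = sum of values
  pop.append((soln, err))

pop = sorted(pop, key=lambda tup: tup[1])  # lowest error first
```

The tuple version trades a bit of readability (`tup[1]` instead of `indiv.err`) for less boilerplate.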

My demo problem sets up a possible solution that looks like [2 0 7 1 4 4 2 3 5 3]. There are 10 values, where each value is an integer between 0 and 7 inclusive. I set up an artificial error function where the error of a solution is just the sum of its values. Therefore, the optimal solution is [0 0 0 0 0 0 0 0 0 0] with error = 0. There are 8 * 8 * . . . * 8 = 8^10 = 1,073,741,824 possible solutions.
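As a quick sanity check on the arithmetic, the error of the example solution and the size of the search space can be computed directly:

```python
import numpy as np

soln = np.array([2, 0, 7, 1, 4, 4, 2, 3, 5, 3])
err = float(np.sum(soln))   # artificial error: just the sum of values
print(err)                  # 31.0

n_solutions = 8 ** 10       # each of 10 positions has 8 possible values
print(n_solutions)          # 1073741824
```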

In high-level pseudo-code, evolutionary optimization is:

```
create a population of random solutions
loop max_generations times
  pick two parent solutions
  make child solution (crossover)
  mutate child
  evaluate child
  replace a weak solution with child
end-loop
return best solution found
```

Anyway, my main point here is not about evolutionary optimization — it’s about the obsession that my colleagues and I have for our work. Obsession can be good or bad. I’ve seen the good (usually) and bad (rarely) sides of work obsession.

Eugenics is the study of how to guide human reproduction and evolution to increase desirable characteristics. Many famous scientists worried that social welfare programs, which encourage unmarried women to have children, would lead to low-quality breeding and cause great harm. Left: William Shockley (1910-1989) was the primary inventor of the transistor at Bell Labs. Nobel Prize in 1956. Center: Francis Crick (1916-2004) is credited, along with James Watson, with discovering the structure of DNA. Nobel Prize in 1962. Right: Karl Pearson (1857-1936) basically created mathematical statistics. One of my heroes.

Demo code:

```python
# evolutionary_hyperparameter_3.py
# (soln[], err) Tuple version

import numpy as np

# n_hp = 10  10 hyperparameters
# n_hpv = 8  each hyperparam has 8 possible values
# n_pop = 6  population size

# -----------------------------------------------------------

class Solver():
  def __init__(self, n_hp, n_hpv, n_pop, seed):
    self.rnd = np.random.RandomState(seed)
    self.n_hp = n_hp
    self.n_hpv = n_hpv
    self.n_pop = n_pop

    self.pop = []  # list of tuples, tuple is (np arr, float)
    for i in range(n_pop):
      soln = self.rnd.randint(low=0, high=n_hpv, size=(n_hp))
      err = self.compute_error(soln)
      self.pop.append((soln, err))

    self.pop = sorted(self.pop, key=lambda tup: tup[1])
    # (self.pop).sort(key=lambda tup: tup[1])  # in place

    # best found at any point in time
    self.best_soln = np.copy(self.pop[0][0])  # soln
    self.best_err = self.pop[0][1]            # err

  def compute_error(self, soln):
    err = 0.0
    for i in range(len(soln)):  # each hyperparam
      err += soln[i]            # small val is low error
    return err

  def show(self):
    for i in range(self.n_pop):
      print("[" + str(i) + "]  ", end="")  # the idx
      print(self.pop[i][0], end="")        # the soln
      print(" | " + str(self.pop[i][1]))   # the err

    print("-----")
    print(self.best_soln, end="")
    print(" | ", end="")
    print(self.best_err)

  def pick_parents(self):
    # pick indices of two solns
    first = self.rnd.randint(0, self.n_pop // 2)  # good
    second = self.rnd.randint(self.n_pop // 2, self.n_pop)
    while second == first:  # not needed this implementation
      second = self.rnd.randint(self.n_pop // 2, self.n_pop)
    flip = self.rnd.randint(2)  # 0 or 1
    if flip == 0:
      return (first, second)
    else:
      return (second, first)

  def crossover(self, i, j):
    # left half pop[i] with right half pop[j]
    child_soln = np.zeros(self.n_hp, dtype=np.int64)
    parent1 = self.pop[i][0]
    parent2 = self.pop[j][0]
    for k in range(0, self.n_hp//2):  # left half
      child_soln[k] = parent1[k]
    for k in range(self.n_hp//2, self.n_hp):  # right half
      child_soln[k] = parent2[k]
    return child_soln

  def mutate(self, soln):
    # a soln is an array with 10 cells, each in [0,7]
    idx = self.rnd.randint(0, self.n_hp)  # pick spot
    flip = self.rnd.randint(2)  # 0 or 1
    if flip == 0:
      soln[idx] -= 1
      if soln[idx] == -1:
        soln[idx] = self.n_hpv-1  # largest
    else:
      soln[idx] += 1
      if soln[idx] == self.n_hpv:  # too much
        soln[idx] = 0

  def search(self, max_gen):
    for gen in range(max_gen):
      # 1. make a child soln using crossover
      (i, j) = self.pick_parents()
      child_soln = self.crossover(i, j)

      # 2. mutate child
      self.mutate(child_soln)
      child_err = self.compute_error(child_soln)

      # 2b. if new child has already been evaluated,
      #     then continue

      # 3. is child a new best soln?
      if child_err < self.best_err:
        print("New best soln found at gen " + str(gen))
        self.best_soln = np.copy(child_soln)
        self.best_err = child_err

      # 4. replace a weak soln with child
      idx = self.rnd.randint(self.n_pop // 2, self.n_pop)
      # print("replacing idx = " + str(idx))
      self.pop[idx] = (child_soln, child_err)  # Tuple

      # 5. sort solns from best to worst
      self.pop = sorted(self.pop, key=lambda tup: tup[1])

# -----------------------------------------------------------

def main():
  print("\nBegin ")

  solver = Solver(10, 8, 6, seed=1)  # 10 params, [0,7], pop=6
  print("\nInitial population: ")
  solver.show()

  print("\nBegin hyperparameter search \n")
  solver.search(100)
  print("\nDone ")

  print("\nFinal population: ")
  solver.show()

  print("\nEnd ")

if __name__ == "__main__":
  main()
```