The Curse of Machine Learning

Most of the machine learning guys I work with suffer from the same curse that I have. We love what we do. The curse is that we’re always thinking about our algorithms and systems and code, even when we’re sleeping (literally).

It’s not uncommon for me to write some code and then, several days or weeks later, realize I wasn’t quite happy with it. When that happens, I can’t rest easy until I refactor the code in question.

This happened recently when I wrote some code for neural network hyperparameter tuning using evolutionary optimization. See jamesmccaffrey.wordpress.com/2022/11/03/hyperparameter-tuning-using-evolutionary-optimization/.

I woke up one morning with an idea to refactor the code. My original version used a nested class definition. My thought was to ditch the nested class and use an array of Tuple items. I knew I’d never be satisfied until I tried my idea. And so I did.
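
Roughly, the refactoring idea looks like this (a minimal sketch — the Individual name is just my placeholder for a nested class, not the exact original code):

# original approach: a nested class pairs a solution with its error
class Individual:  # placeholder name, not the exact original
  def __init__(self, soln, err):
    self.soln = soln  # np array of hyperparameter values
    self.err = err    # error of the solution

# refactored approach: a plain list of (solution, error) tuples
# pop = [(soln_0, err_0), (soln_1, err_1), . . ]
# sorting by error becomes a one-liner:
# pop = sorted(pop, key=lambda tup: tup[1])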

My demo problem sets up a possible solution that looks like [2 0 7 1 4 4 2 3 5 3]. There are 10 values, where each value is an integer between 0 and 7 inclusive. I set up an artificial error function where the error of a solution is just the sum of its values. Therefore, the optimal solution is [0 0 0 0 0 0 0 0 0 0] with error = 0. There are 8 * 8 * . . * 8 = 8^10 = 1,073,741,824 possible solutions.
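
As a quick sanity check, the error of the example solution and the size of the search space are easy to verify (a minimal sketch):

import numpy as np

soln = np.array([2, 0, 7, 1, 4, 4, 2, 3, 5, 3])
print(np.sum(soln))  # 31 -- error is just the sum of the values
print(8 ** 10)       # 1073741824 possible solutions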

In high-level pseudo-code, evolutionary optimization is:

create a population of random solutions
loop max_generations times
  pick two parent solutions
  make child solution (crossover)
  mutate child
  evaluate child
  replace a weak solution with child
end-loop
return best solution found
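
Stripped of all class structure, that loop can be sketched in a few lines (a minimal sketch of the pseudo-code, not my demo program below — it uses a simpler mutation that assigns a random value rather than the +/- 1 wrap-around used in the demo):

import numpy as np

rnd = np.random.RandomState(0)
n_hp, n_hpv, n_pop = 10, 8, 6  # hyperparams, values each, pop size

def error(soln):
  return float(np.sum(soln))  # artificial error: sum of values

# create a population of random solutions, sorted best-first
pop = [rnd.randint(0, n_hpv, n_hp) for _ in range(n_pop)]
pop.sort(key=error)

for gen in range(100):            # loop max_generations times
  i = rnd.randint(0, n_pop // 2)  # pick two parent solutions
  j = rnd.randint(n_pop // 2, n_pop)
  child = np.concatenate((pop[i][:n_hp//2], pop[j][n_hp//2:]))  # crossover
  child[rnd.randint(n_hp)] = rnd.randint(n_hpv)  # mutate one position
  pop[rnd.randint(n_pop // 2, n_pop)] = child    # replace a weak soln
  pop.sort(key=error)             # keep population sorted by error

print(pop[0], error(pop[0]))      # best solution found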

Anyway, my main point here is not about evolutionary optimization — it’s about the obsession that my colleagues and I have for our work. Obsession can be good or bad. I’ve seen the good (usually) and bad (rarely) sides of work obsession.



Eugenics is the study of how to guide human reproduction and evolution to increase desirable characteristics. Many famous scientists worried that social welfare programs, by encouraging unmarried women to have children, would lead to low-quality breeding and cause great harm. Left: William Shockley (1910-1989) was the primary inventor of the transistor at Bell Labs. Nobel Prize in 1956. Center: Francis Crick (1916-2004) is credited, along with James Watson, with discovering the structure of DNA. Nobel Prize in 1962. Right: Karl Pearson (1857-1936) basically created mathematical statistics. One of my heroes.


Demo code.

# evolutionary_hyperparameter_3.py
# (soln[], err) Tuple version

import numpy as np

# n_hp = 10  10 hyperparameters
# n_hpv = 8  each hyperparam has 8 possible values
# n_pop = 6  population size

# -----------------------------------------------------------

class Solver:
  def __init__(self, n_hp, n_hpv, n_pop, seed):
    self.rnd = np.random.RandomState(seed)
    self.n_hp = n_hp
    self.n_hpv = n_hpv
    self.n_pop = n_pop

    self.pop = []  # list of tuples, tuple is (np arr, float)
    for i in range(n_pop):
      soln = self.rnd.randint(low=0, high=n_hpv, size=n_hp)
      err = self.compute_error(soln)
      self.pop.append((soln, err))

    self.pop = sorted(self.pop, key=lambda tup: tup[1]) 
    # (self.pop).sort(key=lambda tup: tup[1])  # in place

    # best found at any point in time
    self.best_soln = np.copy(self.pop[0][0])  # soln
    self.best_err = self.pop[0][1]  # err
      
  def compute_error(self, soln): 
    err = 0.0
    for i in range(len(soln)):  # each hyperparam
      err += soln[i]            # small val is low error
    return err 

  def show(self):
    for i in range(self.n_pop):
      print("[" + str(i) + "]  ", end="")  # the idx
      print(self.pop[i][0], end="")        # the soln
      print(" | " + str(self.pop[i][1]))   # the err

    print("-----")
    print(self.best_soln, end="")
    print(" | ", end="")
    print(self.best_err)

  def pick_parents(self):
    # pick indices of two solns
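    # first is drawn from the fitter (sorted) half of the
    # population, second from the weaker half; the coin flip
    # below randomizes which parent supplies the left half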
    first = self.rnd.randint(0, self.n_pop // 2)  # good
    second = self.rnd.randint(self.n_pop // 2, self.n_pop) 
    while second == first:  # not needed in this implementation
      second = self.rnd.randint(self.n_pop // 2, self.n_pop)
    flip = self.rnd.randint(2)  # 0 or 1
    if flip == 0:
      return (first, second)
    else:
      return (second, first)

  def crossover(self, i, j):
    # left half pop[i] with right half pop[j]
    child_soln = np.zeros(self.n_hp, dtype=np.int64)
    parent1 = self.pop[i][0]
    parent2 = self.pop[j][0]
    for k in range(0, self.n_hp//2):  # left half
      child_soln[k] = parent1[k]
    for k in range(self.n_hp//2, self.n_hp):  # right half
      child_soln[k] = parent2[k]
    return child_soln

  def mutate(self, soln):
    # a soln is an array with 10 cells, each in [0,7]
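    # add or subtract 1 at one randomly chosen position,
    # wrapping around at the [0, n_hpv-1] boundaries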
    idx = self.rnd.randint(0, self.n_hp)  # pick spot
    flip = self.rnd.randint(2)  # 0 or 1
    if flip == 0:
      soln[idx] -= 1
      if soln[idx] == -1:
        soln[idx] = self.n_hpv-1  # largest
    else:
      soln[idx] += 1
      if soln[idx] == self.n_hpv: # too much
        soln[idx] = 0

  def search(self, max_gen):
    for gen in range(max_gen):
      # 1. make a child soln using crossover
      (i, j) = self.pick_parents()
      child_soln = self.crossover(i, j)

      # 2. mutate child
      self.mutate(child_soln)
      child_err = self.compute_error(child_soln)

      # 2b. if new child has already been evaluated, 
      #     then continue
      
      # 3. is child a new best soln?
      if child_err "lt" self.best_err:  # replace with operator
        print("New best soln found at gen " + str(gen))
        self.best_soln = np.copy(child_soln)
        self.best_err = child_err

      # 4. replace a weak soln with child
      idx = self.rnd.randint(self.n_pop // 2, self.n_pop)
      # print("replacing idx = " + str(idx))
      self.pop[idx] = (child_soln, child_err)  # Tuple

      # 5. sort solns from best to worst
      self.pop = sorted(self.pop, key=lambda tup: tup[1]) 
  
# -----------------------------------------------------------

def main():
  print("\nBegin ")

  solver = Solver(10, 8, 6, seed=1)  # 10 params, [0,7], pop=6
  print("\nInitial population: ")
  solver.show()

  print("\nBegin hyperparameter search \n")
  solver.search(100)
  print("\nDone ")

  print("\nFinal population: ")
  solver.show()

  print("\nEnd ")

if __name__ == "__main__":
  main()