“Researchers Evaluate the Top Four AI Stories of 2022” on the Pure AI Web Site

I contributed to an article titled “Researchers Evaluate the Top Four AI Stories of 2022” in the January 2023 edition of the Pure AI web site. See https://pureai.com/articles/2023/01/05/top-ai-stories-of-2022.aspx.

I am a regular contributing editor for the Pure AI site. For this article, I collaborated with two other AI experts and we reviewed AI/ML news stories from 2022. We ultimately agreed on these four significant stories:

The release of the DALL-E 2 system to generate artificial images.
The development of the AlphaFold system to predict protein structure.
The development of the Cicero system to play master-level Diplomacy.
The release of ChatGPT to answer arbitrary text-based questions.



Images generated by DALL-E from the prompt, “A painting of a fox sitting in a field at sunrise in the style of Claude Monet”. OK, but how can this generate significant revenue?


Of these four, the two that I had the strongest opinions on were AlphaFold and ChatGPT. I was impressed with AlphaFold and said that it’s possible the system could lead to huge advances in biology. I was not impressed with ChatGPT, feeling it’s over-hyped. The main problem with ChatGPT is that there’s little or no control over the source of responses to a question like, “Why did Russia invade Ukraine?”



Research shows that women chat much more than men in social scenarios. There are surprisingly few chatty women robots in science fiction movies. Left: Ava is an advanced experiment in “Ex Machina” (2014). Center: Rachel is an administrative assistant in “Blade Runner” (1982). Right: Arlette is a harlot in “Westworld” (1973).


Posted in Machine Learning | Leave a comment

What Are Correct Values for Precision and Recall When the Denominators Are Zero?

I did an Internet search for “What are correct values for precision and recall when the denominators equal 0?” and was pointed to a StackExchange page which had been up for over 11 years — and which was somewhat ambiguous. See https://stats.stackexchange.com/questions/8025/what-are-correct-values-for-precision-and-recall-when-the-denominators-equal-0.

The source of the issue is the definitions of FP (false positives) and FN (false negatives).

Years ago, I was taught that TP (true positives) are actual positive (class 1) items that are predicted correctly. FP (false positives) are actual positive items that are predicted incorrectly. TN (true negatives) are actual negative (class 0) items that are predicted correctly. FN (false negatives) are actual negative items that are predicted incorrectly. These definitions make sense from an English point of view.

However, a more common definition nowadays is that FP is an actual negative that’s predicted incorrectly, and FN is an actual positive that’s predicted incorrectly.

Precision and recall are defined as:

p (precision) = TP / (TP + FP)
r (recall)    = TP / (TP + FN)

Here’s an example using the modern, more common definitions:

actual  predicted
  0       0        TN  
  0       0        TN
  0       1        FP
  0       1        FP
  1       0        FN
  1       1        TP
  1       1        TP
  1       1        TP

p = TP / (TP + FP) = 3 / (3+2) = 3/5

r = TP / (TP + FN) = 3 / (3+1) = 3/4

OK. But what if either denominator is 0? For precision, if TP + FP = 0, then TP = 0 and also FP = 0. The only way this can happen is if all predictions are class 0 (negative), for example:

actual  predicted
  0       0        TN  
  0       0        TN
  0       0        TN
  0       0        TN
  1       0        FN
  1       0        FN
  1       0        FN
  1       0        FN

In other words, all predictions are negative. This should raise a warning that the prediction system could be broken. The StackExchange page states, “If (true positives + false positives) = 0 then all cases have been predicted to be negative.” This is true if the more common definitions of FP and FN are used.

Now for recall, if TP + FN = 0, then TP = 0 and also FN = 0. This could happen like this:

actual  predicted
  0       0        TN  
  0       0        TN
  0       1        FP
  0       1        FP
  0       0        TN
  0       0        TN
  0       0        TN
  0       0        TN

All actual labels are negative. If any actual data item were positive, then there’d be at least one TP (if predicted correctly) or one FN (if predicted incorrectly). This scenario (no positive data items at all) should never happen. The StackExchange page states, “If (true positives + false negatives) = 0 then no positive cases in the input data.”

Bottom line:

Using the common definitions of FP and FN:

1.) If TP + FP = 0, a warning should be printed that the prediction system is likely flawed because it always predicts class negative.

2.) If TP + FN = 0, a warning should be printed that the data is flawed because there are no actual positives.
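
Here’s a minimal from-scratch sketch that computes precision and recall and prints these warnings. The function name and the convention of returning 0.0 when a denominator is zero are my choices, not from the StackExchange page:

# precision_recall.py
# minimal sketch: precision and recall with warnings
# when a denominator is zero

def precision_recall(actuals, predicteds):
  tp = fp = fn = 0
  for a, p in zip(actuals, predicteds):
    if a == 1 and p == 1: tp += 1    # true positive
    elif a == 0 and p == 1: fp += 1  # false positive (common defn)
    elif a == 1 and p == 0: fn += 1  # false negative (common defn)

  if tp + fp == 0:
    print("Warning: no positive predictions -- model may be broken ")
    prec = 0.0
  else:
    prec = tp / (tp + fp)

  if tp + fn == 0:
    print("Warning: no actual positives -- data may be flawed ")
    rec = 0.0
  else:
    rec = tp / (tp + fn)

  return prec, rec

actuals    = [0, 0, 0, 0, 1, 1, 1, 1]
predicteds = [0, 0, 1, 1, 0, 1, 1, 1]
p, r = precision_recall(actuals, predicteds)
print("precision = %0.4f  recall = %0.4f " % (p, r))
# precision = 0.6000  recall = 0.7500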

Do not believe everything you read on the Internet without questioning it.



I’m not sure why, but I always associate the word “precision” with high quality wrist watches. Here are three watches that don’t immediately bring the idea of precision to mind. Left: This one is a-mazing. Center: I don’t know why all wrist watches don’t have built-in lighters. What could go wrong? Right: Very stylish, but the vacuum tubes might not be necessary.


Posted in Machine Learning | 1 Comment

NFL 2022 Season Super Bowl LVII Prediction – Zoltar Predicts the Chiefs Will Beat the Eagles

Zoltar is my NFL football prediction computer program. It uses reinforcement learning and a neural network. Here are Zoltar’s predictions for week #22 (Super Bowl LVII) of the 2022 season.

Zoltar:      chiefs  by    3  dog =     eagles    Vegas:      eagles  by  2.0

Zoltar theoretically suggests betting when the Vegas line is “significantly” different from Zoltar’s prediction. For this season I’ve been using a threshold of 4 points difference but in some previous seasons I used 3 points.

For Super Bowl LVII (week #22) Zoltar thinks the Kansas City Chiefs are 3 points better than the Philadelphia Eagles. Las Vegas thinks the Eagles are 2.0 points better than the Chiefs. Therefore, the difference of opinion is 5 points and so Zoltar suggests a wager on the Chiefs.

A bet on the Chiefs will pay off if the Chiefs win by any score, or if the Eagles win by less than 2.0 points (i.e., 1 point). If the Eagles win by exactly 2 points, the bet is a push.

Theoretically, if you must bet $110 to win $100 (typical in Vegas) then you’ll make money if you predict at 53% accuracy or better. But realistically, you need to predict at 60% accuracy or better.
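
Where does the 53% figure come from? The break-even win probability p satisfies p * 100 = (1 - p) * 110, so p = 110 / 210, which is roughly 52.4%, rounded up to 53%:

# break-even accuracy for a $110-to-win-$100 wager
p = 110.0 / 210.0  # solve p * 100 = (1 - p) * 110
print("%0.4f " % p)  # 0.5238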

In week #21, against the Vegas point spread, Zoltar went 0-0 using 4.0 points as the advice threshold because Zoltar’s two predictions (49ers at Eagles, and Bengals at Chiefs) both agreed closely with Vegas predictions.

For the season, against the spread, Zoltar is 57-31 (~64% accuracy).

Just for fun, I track how well Zoltar does when trying to predict just which team will win a game. This isn’t useful except for parlay betting. In week #21, just predicting the winning team, Zoltar went 2-0. Vegas also went 2-0 at just predicting the winning team.



My prediction system is named after the Zoltar fortune teller machine you can find in arcades. Arcade Zoltar is named after the Zoltar machine that was featured in the 1988 movie “Big” where the machine was magical and granted a boy’s wish to become an adult. The movie Zoltar was named after a 1960s era fortune teller machine named Zoltan.


Posted in Zoltar | Leave a comment

“Logistic Regression from Scratch Using Raw Python” in Visual Studio Magazine

I wrote an article titled “Logistic Regression from Scratch Using Raw Python” in the January 2023 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2023/01/18/logistic-regression.aspx.

Logistic regression is a machine learning technique for binary classification. For example, you might want to predict the sex of a person (male or female) based on their age, state where they live, income and political leaning. There are many other techniques for binary classification, but logistic regression was one of the earliest developed and the technique is considered a fundamental machine learning skill for data scientists.

The article presents a complete end-to-end demo. The goal is to predict a person’s sex based on age, state, income, and political leaning. The data looks like:

1   0.24   1   0   0   0.2950   0   0   1
0   0.39   0   0   1   0.5120   0   1   0
1   0.63   0   1   0   0.7580   1   0   0
0   0.36   1   0   0   0.4450   0   1   0
1   0.27   0   1   0   0.2860   0   0   1
. . .

The tab-delimited fields are sex (0 = male, 1 = female), age (divided by 100), state (Michigan = 100, Nebraska = 010, Oklahoma = 001), income (divided by $100,000) and political leaning (conservative = 100, moderate = 010, liberal = 001).

The structure of the demo program is:

# people_gender_log_reg.py
# Anaconda3-2020.02  Python 3.7.6
# Windows 10/11

import numpy as np

# -----------------------------------------------------------

def compute_output(w, b, x): . . . 
def accuracy(w, b, data_x, data_y): . . . 
def mse_loss(w, b, data_x, data_y): . . . 

# -----------------------------------------------------------

def main():
  # 0. get ready
  print("Begin logistic regression with raw Python demo ")
  np.random.seed(1)

  # 1. load data into memory
  # 2. create model
  # 3. train model
  # 4. evaluate model
  # 5. use model
  
  print("End People logistic regression demo ")

if __name__ == "__main__":
  main()

Logistic regression is relatively easy to implement — so much so that a from-scratch version is easier for me than using a logistic regression module from a library like ML.NET or scikit-learn.
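
As a minimal sketch of the core computation (my variable names; see the article for the full implementation), the model output is the logistic sigmoid applied to a weighted sum of the inputs, and training nudges the weights and bias by gradient descent:

import numpy as np

def compute_output(w, b, x):
  # p = sigmoid(w . x + b)
  z = np.dot(w, x) + b
  return 1.0 / (1.0 + np.exp(-z))

def sgd_update(w, b, x, y, lr):
  # one stochastic gradient descent step; this is the standard
  # log-loss update rule (the demo monitors MSE loss separately)
  p = compute_output(w, b, x)
  w = w + lr * (y - p) * x  # w and x are numpy vectors
  b = b + lr * (y - p)
  return w, b

A computed p less than 0.5 maps to a prediction of class 0 (male); otherwise the prediction is class 1 (female).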



School employees create signage from scratch, but sometimes they’d be better off getting someone else to do the message spelling.


Posted in Machine Learning | Leave a comment

Solving the Traveling Salesman Problem (TSP) Using an Epsilon-Greedy Algorithm

An epsilon-greedy algorithm is a general approach that can be used for many different problems. I recently devised a nice evolutionary algorithm for the Traveling Salesman Problem (TSP) that seems to work very well. Just for fun, I spent one weekend morning implementing an epsilon-greedy algorithm for TSP.

Briefly: the epsilon-greedy approach did not work as well as my evolutionary algorithm.

In high-level pseudo-code, an epsilon-greedy algorithm for TSP looks like:

create a random guess solution
loop many times
  use curr guess to generate a candidate guess
  if candidate is best seen so far, save it
  generate an epsilon in [0, 1]
  if epsilon less-than small value:
    update curr guess (accept a possibly worse solution)
  else if candidate is better than curr:
    update curr guess (greedy part)
  else:
    make no change to curr guess
end-loop
return best guess found

The value of epsilon must be determined by trial and error. Typical values are things like 0.01 and 0.05.

A non-technical issue here is that like most of the machine learning engineers and scientists I know, once I latch onto a problem like TSP, I find it difficult to avoid trying multiple algorithms, even in cases where I’m quite sure some algorithms don’t work well.



In computer science, “epsilon” usually refers to a very small value. Here is an example of a small processor from IBM hardware research. The processor is about the size of a grain of salt, and has roughly the power of a 1990s CPU.


Demo code:

# tsp_epsilon_greedy.py

import numpy as np

# -----------------------------------------------------------

class Solver():
  def __init__(self, num_cities, distances, seed):
    self.num_cities = num_cities
    self.rnd = np.random.RandomState(seed)
    self.distances = distances  # by ref

    self.curr_soln = np.arange(num_cities, dtype=np.int64)
    self.rnd.shuffle(self.curr_soln)
    self.curr_err = self.compute_error(self.curr_soln)

    self.best_soln = np.copy(self.curr_soln)  # soln
    self.best_err = self.curr_err            # err
 
  def show(self):
    # print("Best solution: ")
    print(self.best_soln, end ="")
    print(" | %0.4f " % self.best_err)

  def compute_error(self, soln):
    n = self.num_cities
    d = 0.0
    for i in range(n-1):  # note: path does not return to the start city
      from_idx = soln[i]
      to_idx = soln[i+1]
      d += self.distances[from_idx][to_idx]
    return d

  def solve(self, max_iter):
    for iter in range(max_iter):
      # 1. pick two indices
      idx1 = self.rnd.randint(0,self.num_cities)
      idx2 = self.rnd.randint(0,self.num_cities)
      # print(idx1); print(idx2); input()

      # 2. create an offspring/candidate
      candidate = np.copy(self.curr_soln)
      tmp = candidate[idx1]
      candidate[idx1] = candidate[idx2]
      candidate[idx2] = tmp
      cand_err = self.compute_error(candidate)
      # print(candidate); print(cand_err); input()

      # 3. is candidate new best?
      if cand_err < self.best_err:
        self.best_soln = np.copy(candidate)
        self.best_err = cand_err

      # 4. epsilon-greedy: rarely accept the candidate even if
      # it's worse; otherwise accept only if better than curr
      epsilon = self.rnd.random()
      if epsilon < 0.01:
        self.curr_soln = np.copy(candidate)
        self.curr_err = cand_err
      elif cand_err < self.curr_err:
        self.curr_soln = np.copy(candidate)
        self.curr_err = cand_err
      else:
        pass  # no change to curr soln
    
# -----------------------------------------------------------

def get_distances(n_cities):
  # from text file in non-demo scenario
  result = np.zeros((n_cities, n_cities), dtype=np.float32)
  for i in range(n_cities):
    for j in range(n_cities):
      if i < j: result[i][j] = (j - i) * 1.0
      elif j < i: result[i][j] = (i - j) * 1.5
  return result

# -----------------------------------------------------------

def main():
  print("\nBegin TSP using epsilon-greedy ")
  print("Setting TSP n = 20 cities ")
  print("Note: 20! = 2,432,902,008,176,640,000 ")
  print("Optimal soln is 0 1 2 . . 19 with dist = 19.0 ")
      
  num_cities = 20
  distances = get_distances(num_cities)
    
  solver = Solver(num_cities, distances, seed=1)
  print("\nInitial guess: ")
  solver.show()

  print("\nBegin search ")
  solver.solve(10000)
  print("Done ")

  print("\nFinal best estimate: ")
  solver.show()

  print("\nEnd demo ")

# -----------------------------------------------------------

if __name__ == "__main__":
  main()
Posted in Machine Learning | Leave a comment

Another Look at GPT-3 / Codex / GitHub Copilot – I Have Mixed Opinions

GPT-3 (“Generative Pre-trained Transformer”) is a large language model that can generate text, such as a response to, “Write two paragraphs about the history of computer programming.” GPT-3 was trained on an enormous corpus of text — Wikipedia, books, blogs, and so on. GPT-3 is owned by OpenAI with major funding from Microsoft. ChatGPT is a chatbot based on GPT-3.

Codex is an extension of GPT-3 and ChatGPT that received additional training on billions of lines of computer code — so in addition to natural language, Codex understands computer code in several languages such as Python, JavaScript, C#, C++, HTML, CSS, SQL, and others.

You use GPT-3 and Codex by typing queries into a text box. A Codex query might be, “Show me a C# function for binary search.” The result would be code that you can copy-paste.

GitHub Copilot is essentially a wrapper over Codex that’s integrated into a development environment such as Microsoft Visual Studio. This is more efficient because you get the Codex results directly in your program instead of having to copy-paste. A downside is that, after a free trial period of 60 days, you must pay $10 per month for Copilot (currently — by the time you read this the cost might be different).

A long time ago, I experimented with a pre-release version of Copilot. It had quite a few bugs and glitches. I figured I’d investigate the released version.

Because I use Visual Studio for most of my C# development, I decided to install Copilot on VS. Installation directions for Copilot at https://docs.github.com/en/copilot/quickstart were very good and I had no major trouble getting Copilot installed. Briefly, I logged onto my GitHub account, went to Settings, and requested access to Copilot. After giving my credit card info, GitHub asked for an authorization code. I opened up VS and created a dummy project and then added the GitHub Copilot Extension from the Microsoft Store (rather than from GitHub as I had first expected, because there is billing involved). VS then gave me an 8-character authorization code which I placed into GitHub. I restarted VS and voila, Copilot was installed in VS.

Using Copilot in VS is quite a remarkable experience. Briefly, you enter a comment and then hit enter. I used a dummy Console Application (shell) C# project.

I typed the comment:

// show code for a list of color items

and got:

public class ColorItem
{
  public string Color { get; set; }
  public string Name { get; set; }
}

Cool. Then I typed:

// create a list of color items

I got the following. Notice Copilot knew the current conversation context.

List<ColorItem> colorItems = new List<ColorItem>
{
  new ColorItem { Color = "Red", Name = "Red" },
  new ColorItem { Color = "Green", Name = "Green" },
  new ColorItem { Color = "Blue", Name = "Blue" }
};

Next I typed:

// now function to print the list

and got:

public void PrintColorItems()
{
  foreach (var item in colorItems)
  {
    Console.WriteLine(item.Color + " " + item.Name);
  }
}

I have strong mixed opinions about Codex / Copilot. Copilot is essentially an efficient alternative to issuing queries into Google and then reading through blog posts, Stack Overflow answers, and so on. But the inefficiency of the Google approach is useful because it simultaneously teaches me and gets lodged into my memory. Copilot just gives an answer with no active learning going on and so my skills don’t grow.

My conclusion is that Copilot (and Codex) is probably good for simple programming tasks such as generating CSS, SQL, or common C# code like an array binary search function. But Copilot is probably not good for advanced algorithms such as those I use in my machine learning work.



Two parodies of computer programming book covers that are spot-on.


Posted in Machine Learning | Leave a comment

Binary Classification Using a scikit Decision Tree

I hadn’t looked at using a decision tree from the scikit-learn (scikit for short) library for several months, so I figured I’d do an example. Before I go any further: I am not a big fan of decision trees, and this example reinforces my opinion.

I used one of my standard datasets for binary classification. The data looks like:

  1   0.24   1 0 0   0.2950   0 0 1
  0   0.39   0 0 1   0.5120   0 1 0
  1   0.63   0 1 0   0.7580   1 0 0
  0   0.36   1 0 0   0.4450   0 1 0
. . . 

Each line of data represents a person. The fields are sex (male = 0, female = 1), age (normalized by dividing by 100), state (Michigan = 100, Nebraska = 010, Oklahoma = 001), annual income (divided by 100,000), and politics type (conservative = 100, moderate = 010, liberal = 001). The goal is to predict the gender of a person from their age, state, income, and politics type. The data can be found at: https://jamesmccaffrey.wordpress.com/2022/09/23/binary-classification-using-pytorch-1-12-1-on-windows-10-11/.

It isn’t necessary to normalize age and income for a decision tree. Converting categorical predictors like state and politics type is conceptually tricky, but the bottom line is that in most scenarios it’s best to one-hot encode rather than integer (label) encode. Ordinal data like low = 0, medium = 1, high = 2 can be ordinal-encoded.

The scikit documentation states that for binary classification, the variable to predict should be encoded as -1 and +1. However, one of the documentation examples uses 0 and 1 which is more consistent with other binary classification algorithms. In fact, it’s possible to use text labels, such as “M” and “F”, for the target class too. The scikit documentation has quite a few inconsistencies like this.
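
For example, here’s a tiny contrived sketch (my own made-up data, not the demo data) showing that a scikit decision tree accepts text class labels directly:

from sklearn.tree import DecisionTreeClassifier

X = [[0.24], [0.39], [0.63], [0.36]]  # single predictor: age
y = ["F", "M", "F", "M"]              # text labels instead of 0/1
clf = DecisionTreeClassifier(max_depth=2, random_state=1)
clf.fit(X, y)
print(clf.predict([[0.55]]))          # displays ['F']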

The key lines of code are:

md = 4  # depth
print("Creating decision tree with max_depth=" + str(md))
model = tree.DecisionTreeClassifier(max_depth=md,
  random_state=1) 
model.fit(train_x, train_y)
print("Done ")

Decision trees are highly prone to overfitting. If you set a large max_depth you can get 100% classification accuracy on training data, but the accuracy on test data and on new, previously unseen data will likely be very poor.

I wrote a custom tree_to_pseudo() function that displays a tree in text format. Note: I kludged the function together from examples I found on the Internet — the code is very, very tricky. The first part of the output of tree_to_pseudo() looks like:

 if ( income <= 0.3400 ) {
   if ( pol2 <= 0.5 ) {
     if ( age <= 0.235 ) {
       if ( income <= 0.2815 ) {
         return 1.0
       } else {
         return 0.0
       }
     } else {
       return 1.0
     }
   } else {
     return 1.0
   }
 } else {
. . . 

There is also a built-in tree.export_text() function that gives similar results:

|--- income <= 0.34
|   |--- pol2 <= 0.50
|   |   |--- age <= 0.23
|   |   |   |--- income <= 0.28
|   |   |   |   |--- class: 1.0
. . .

Anyway, the demo was a good refresher for me.



A few years ago some researchers in the UK did an experiment where they determined that people can identify the gender of a person solely by how they walk. Fashion models extend this idea to exaggerated walking styles. Left: The long stride. Center: The walk on a straight line. Right: The sway.


Demo code. The data can be found at jamesmccaffrey.wordpress.com/2022/09/23/binary-classification-using-pytorch-1-12-1-on-windows-10-11/.

# people_gender_tree.py

# predict gender (0 = male, 1 = female) 
# from age, state, income, politics-type

# data:
#  0   0.39   0   0   1   0.5120   0   1   0
#  1   0.27   0   1   0   0.2860   1   0   0
# . . . 

# Anaconda3-2020.02  Python 3.7.6
# Windows 10/11  scikit 0.22.1

import numpy as np 
from sklearn import tree 

# ---------------------------------------------------------

def tree_to_pseudo(tree, feature_names):
  left = tree.tree_.children_left
  right = tree.tree_.children_right
  threshold = tree.tree_.threshold
  features = [feature_names[i] for i in tree.tree_.feature]
  value = tree.tree_.value

  def recurse(left, right, threshold, features, node, depth=0):
    indent = "  " * depth
    if (threshold[node] != -2):
      print(indent,"if ( " + features[node] + " <= " + \
        str(threshold[node]) + " ) {")
      if left[node] != -1:
        recurse(left, right, threshold, features, \
          left[node], depth+1)
        print(indent,"} else {")
        if right[node] != -1:
          recurse(left, right, threshold, features, \
            right[node], depth+1)
        print(indent,"}")
    else:
      idx = np.argmax(value[node])
      # print(indent,"return " + str(value[node]))
      print(indent,"return " + str(tree.classes_[idx]))

  recurse(left, right, threshold, features, 0)

# ---------------------------------------------------------

def main():
  # 0. get ready
  print("\nBegin scikit decision tree example ")
  print("Predict sex from age, state, income, politics ")
  np.random.seed(0)

  # 1. load data
  print("\nLoading data into memory ")
  train_file = ".\\Data\\people_train.txt"
  train_xy = np.loadtxt(train_file, usecols=range(0,9),
    delimiter="\t", comments="#",  dtype=np.float32) 
  train_x = train_xy[:,1:9]
  train_y = train_xy[:,0]

  test_file = ".\\Data\\people_test.txt"
  test_xy = np.loadtxt(test_file, usecols=range(0,9),
    delimiter="\t", comments="#",  dtype=np.float32) 
  test_x = test_xy[:,1:9]
  test_y = test_xy[:,0]

  np.set_printoptions(suppress=True)
  print("\nTraining data:")
  print(train_x[0:4])
  print(". . . \n")
  print(train_y[0:4])
  print(". . . ")

  # 2. create and train 
  md = 4
  print("\nCreating decision tree max_depth=" + str(md))
  model = tree.DecisionTreeClassifier(max_depth=md,
    random_state=1) 
  model.fit(train_x, train_y)
  print("Done ")

  # 3. evaluate
  acc_train = model.score(train_x, train_y)
  print("\nAccuracy on train = %0.4f " % acc_train)
  acc_test = model.score(test_x, test_y)
  print("Accuracy on test = %0.4f " % acc_test)

  # 3b. use model
  # print("\nPredict age 36, Oklahoma, $50K, moderate ")
  # x = np.array([[0.36, 0,0,1, 0.5000, 0,1,0]],
  #   dtype=np.float32)
  # predicted = model.predict(x)
  # print(predicted)

  # 4. visualize
  print("\nTree in pseudo-code: ")
  tree_to_pseudo(model, ["age", "state0", "state1", "state2",
    "income",  "pol0", "pol1", "pol2"])

  # 4b. use built-in export_text()
  # recall: from sklearn import tree
  pseudo = tree.export_text(model, ["age", "state0", "state1",
    "state2", "income",  "pol0", "pol1", "pol2"])
  print(pseudo)

  # 4c. use built-in plot_tree()
  import matplotlib.pyplot as plt
  tree.plot_tree(model, feature_names=["age", "state0",
    "state1", "state2", "income",  "pol0", "pol1", "pol2"],
    class_names=["male", "female"])
  plt.show()

  # 5. TODO: save model using pickle

if __name__ == "__main__":
  main()
Posted in Scikit | Leave a comment

NFL 2022 Week 21 (Conference Championships) Predictions – Zoltar Likes the 49ers More Than Las Vegas Does

Zoltar is my NFL football prediction computer program. It uses reinforcement learning and a neural network. Here are Zoltar’s predictions for week #21 (conference championship games) of the 2022 season.

Zoltar: fortyniners  by    0  dog =      eagles    Vegas:      eagles  by  2.5
Zoltar:      chiefs  by    4  dog =     bengals    Vegas:      chiefs  by  2.5

Zoltar theoretically suggests betting when the Vegas line is “significantly” different from Zoltar’s prediction. For this season I’ve been using a threshold of 4 points difference but in some previous seasons I used 3 points.

For week #21 Zoltar agrees closely with the Las Vegas point spread. The largest difference of opinion is that Zoltar thinks that the 49ers vs. Eagles game is a toss-up, while Las Vegas has the Eagles as 2.5-point favorites.

Because Zoltar’s two predictions are within 4 points of the Vegas point spread, Zoltar has no advice.

Theoretically, if you must bet $110 to win $100 (typical in Vegas) then you’ll make money if you predict at 53% accuracy or better. But realistically, you need to predict at 60% accuracy or better.

In week #20, against the Vegas point spread, Zoltar went 0-0 using 4.0 points as the advice threshold because Zoltar’s four predictions all agreed closely with Vegas predictions (just like this week).

For the season, against the spread, Zoltar is 57-31 (~64% accuracy).

Just for fun, I track how well Zoltar does when trying to predict just which team will win a game. This isn’t useful except for parlay betting. In week #20, just predicting the winning team, Zoltar went 3-1, which is about average. Vegas was also 3-1 at just predicting the winning team. Both Zoltar and Vegas figured that the Bills would beat the Bengals, but the Bengals won rather convincingly by a score of 27-10.



My prediction system is named after the Zoltar fortune teller machine you can find in arcades. I am not embarrassed to admit that I like old animated cartoons that are aimed at an adult audience as well as children.

Left: This is a scene from “Snoopy Come Home” (1972). Charlie Brown and Peppermint Patty consult a fortune teller.

Center: A scene from “An Ounce of Pink” (1965). The Pink Panther comes across an annoying talking fortune teller machine that convinces the Panther to take him home.

Right: A scene from “The Weather Lady” (1963) episode of the Rocky and Bullwinkle show. The citizens of Frostbite Falls buy a fortune teller to predict the weather. The machine is stolen by Boris Badenov and Natasha Fatale.


Posted in Zoltar | Leave a comment

Revisiting Binary Classification Using scikit Logistic Regression

It had been a while since I looked at logistic regression using the scikit-learn (scikit or sklearn for short) machine learning library. Like any kind of skill, it’s important to stay in practice.

I used one of my standard datasets for binary classification. The data is synthetic and looks like:

 1   0.24   1 0 0   0.2950   0 0 1
 0   0.39   0 0 1   0.5120   0 1 0
 1   0.63   0 1 0   0.7580   1 0 0
 0   0.36   1 0 0   0.4450   0 1 0
. . . 

Each line of tab-delimited data represents a person. The fields are sex (male = 0, female = 1), age (normalized by dividing by 100), state (Michigan = 100, Nebraska = 010, Oklahoma = 001), annual income (divided by 100,000), and politics type (conservative = 100, moderate = 010, liberal = 001). The goal is to predict the gender of a person from their age, state, income, and politics type.

There are 200 lines of training data and 40 lines of test data. The complete data can be found at:
jamesmccaffrey.wordpress.com/2022/09/23/binary-classification-using-pytorch-1-12-1-on-windows-10-11/

I used the version of scikit that was installed with Anaconda Python version Anaconda3-2020.02 (with Python 3.7.6), which is scikit version 0.22.1.

Using scikit has pros and cons. The pros are that scikit is easy to use and has a lot of nice built-in modules. The cons are that scikit is difficult to customize and the code is essentially a black box (open source but impossible to decipher).

The key statements are:

model = LogisticRegression(random_state=0,
  solver='sag', max_iter=1000, penalty='none')
model.fit(train_x, train_y)

The SAG (stochastic average gradient) algorithm is a variation of ordinary SGD (stochastic gradient descent). The penalty can be L1, L2, or elastic net (a combination of L1 and L2).
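
As an illustration, here’s what the model creation would look like with L2 regularization turned on instead (the C parameter is the inverse regularization strength, so smaller C means stronger regularization):

model = LogisticRegression(random_state=0,
  solver='sag', max_iter=1000, penalty='l2', C=1.0)
model.fit(train_x, train_y)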

My scikit logistic regression demo got 72.50% accuracy on the test data. A PyTorch binary classifier network got 85.00% accuracy. A from-scratch Python version of logistic regression got 77.50% accuracy.



There are some interesting analogies between the evolution/development of aircraft design and the evolution/development of machine learning algorithms. Here are three aircraft designs that have a circular design theme but which weren’t successful. Left: The DFW T.28 “Floh” (“Flea” in German) was built in 1917 in Germany by Hermann Dorner. Center: The Vought V-173 “Flying Pancake” was built in 1942 to explore reduced-drag designs. Right: The Stipa was an experimental Italian aircraft designed in 1932. It had a hollow fuselage with the engine and propeller completely enclosed.


Demo code.

# people_gender_scikit.py

# predict gender (0 = male, 1 = female) 
# from age, state, income, politics-type

# data:
# 1   0.24   1   0   0   0.2950   0   0   1
# 0   0.39   0   0   1   0.5120   0   1   0
# 1   0.27   0   1   0   0.2860   0   0   1
# . . . 

# Anaconda3-2020.02  Python 3.7.6
# scikit 0.22.1  Windows 10/11

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score
import pickle

def show_confusion(cm):
  # Confusion matrix whose i-th row and j-th column entry
  # indicates the number of samples with true label being
  # i-th class and predicted label being j-th class.

  ct_act0_pred0 = cm[0][0]  # TN
  ct_act0_pred1 = cm[0][1]  # FP wrongly predicted as pos
  ct_act1_pred0 = cm[1][0]  # FN wrongly predicted as neg 
  ct_act1_pred1 = cm[1][1]  # TP
  
  print("actual 0  | %4d %4d" % (ct_act0_pred0, ct_act0_pred1))
  print("actual 1  | %4d %4d" % (ct_act1_pred0, ct_act1_pred1))
  print("           ----------")
  print("predicted      0    1")
  
# -----------------------------------------------------------

def main():
  # 0. get ready
  print("\nBegin logistic regression with scikit ")
  np.random.seed(1)

  # 1. load data
  print("\nLoading data into memory ")
  train_file = ".\\Data\\people_train.txt"
  train_xy = np.loadtxt(train_file, usecols=range(0,9),
    delimiter="\t", comments="#",  dtype=np.float32) 
  train_x = train_xy[:,1:9]
  train_y = train_xy[:,0]

  test_file = ".\\Data\\people_test.txt"
  test_xy = np.loadtxt(test_file, usecols=range(0,9),
    delimiter="\t", comments="#",  dtype=np.float32) 
  test_x = test_xy[:,1:9]
  test_y = test_xy[:,0]

  print("\nTraining data:")
  print(train_x[0:4])
  print(". . . \n")
  print(train_y[0:4])
  print(". . . ")

  # 2. create model and train
  print("\nCreating logistic regression model")
  model = LogisticRegression(random_state=0,
    solver='sag', max_iter=1000, penalty='none')
  model.fit(train_x, train_y)

  # 3. evaluate
  print("\nComputing model accuracy ")
  acc_train = model.score(train_x, train_y)
  print("Accuracy on training = %0.4f " % acc_train)

  acc_test = model.score(test_x, test_y)
  print("Accuracy on test = %0.4f " % acc_test)

  y_predicteds = model.predict(test_x)
  precision = precision_score(test_y, y_predicteds)
  print("Precision on test = %0.4f " % precision)

  # 4. make a prediction 
  print("\nPredict age 36, Oklahoma, $50K, moderate ")
  x = np.array([[0.36, 0,0,1, 0.5000, 0,1,0]],
    dtype=np.float32)
  
  p = model.predict_proba(x) 
  p = p[0][1]  # first (only) row, second value P(1)

  print("\nPrediction prob = %0.6f " % p)
  if p < 0.5:
    print("Prediction = male ")
  else:
    print("Prediction = female ")

  # 5. save model
  print("\nSaving trained logistic regression model ")
  path = ".\\Models\\people_scikit_model.sav"
  pickle.dump(model, open(path, "wb"))

  # with open(path, 'rb') as f:
  #   loaded_model = pickle.load(f)
  # pa = loaded_model.predict_proba(x)
  # print(pa)

  # 6. confusion matrix with labels
  from sklearn.metrics import confusion_matrix
  cm = confusion_matrix(test_y, y_predicteds)
  print("\nConfusion matrix raw: ")
  print(cm)

  print("\nConfusion matrix custom: ")
  show_confusion(cm)
 
  print("\nEnd People logistic regression demo ")

if __name__ == "__main__":
  main()
Posted in Scikit | Leave a comment

Getting Ready for the PyTorch 2.0 Neural Network Library

The PyTorch web site announced that PyTorch 2.0 is scheduled to be released sometime in March 2023. This is a big deal because major versions (1.0, 2.0, 3.0, etc.) only appear once every few years.

I figured I’d investigate version 2.0. Bottom line: My experiment was only partially successful. Specifically, I was able to get a nightly build version of PyTorch 2.0 installed on a Windows 10/11 machine, but the key feature of 2.0 — the compile() function — wasn’t supported on Windows yet. I don’t fully understand the compile() function but I think its purpose is to improve speed/performance.

Update: I also tried PyTorch 2.0 on a MacOS machine using the February 3 nightly build but I wasn’t successful.

First, I upgraded my Python from my current version 3.7.6 to version 3.9.13 by installing Anaconda3-2022.10. I think Python 3.9 is required for PyTorch 2.0 but I’m not sure. Versioning is always a big problem in the Python/PyTorch ecosystem.

Next, I got one of my standard PyTorch multi-class classification demos running on the current PyTorch version 1.13.1.

And next, I went to the nightly build repository at https://download.pytorch.org/whl/nightly/torch/ and downloaded the most recent torch-2.0.0.dev20230122+cpu-cp39-cp39-win_amd64.whl file. I used pip to uninstall PyTorch version 1.13.1 and then I installed the development version of 2.0.
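
The commands were along these lines, run from the directory holding the downloaded .whl file:

pip uninstall torch
pip install torch-2.0.0.dev20230122+cpu-cp39-cp39-win_amd64.whl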

My demo network was:

import torch as T

class Net(T.nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(6, 10)  # 6-(10-10)-3
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 3)

    T.nn.init.xavier_uniform_(self.hid1.weight)
    T.nn.init.zeros_(self.hid1.bias)
    T.nn.init.xavier_uniform_(self.hid2.weight)
    T.nn.init.zeros_(self.hid2.bias)
    T.nn.init.xavier_uniform_(self.oupt.weight)
    T.nn.init.zeros_(self.oupt.bias)

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = T.tanh(self.hid2(z))
    z = T.log_softmax(self.oupt(z), dim=1)  # NLLLoss() 
    return z

The statements to use the new compile() function were:

  print("Creating 6-(10-10)-3 neural network ")
  net_basic = Net().to(device)
  net = T.compile(net_basic)
  net.train()  # set into training mode

Alas, when I ran the program I got an error message of, “UserWarning: Windows is not currently supported, torch.compile() will do nothing.”
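
Until Windows support arrives, one defensive pattern is to guard the compile() call so the same script runs on older and newer PyTorch versions. A sketch, assuming the rest of the program works with an uncompiled network:

  net = Net().to(device)
  if hasattr(T, "compile"):
    net = T.compile(net)  # currently a no-op warning on Windows
  net.train()  # set into training mode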

Oh well, the process could have been much more painful and I’m one step closer to installing and using PyTorch 2.0.



The first generation of jet fighters were those developed in the 1940s, at the end of World War II. The second generation of fighters were those developed in the 1950s. There were many successful second generation jet fighters, such as the Lockheed F-104 Starfighter and the Vought F-8 Crusader. But there were many unsuccessful second generation fighter designs too.

Left: The Curtiss-Wright XF-87 Blackhawk. Although the XF-87 was a good plane, a competing design, the Northrop F-89 Scorpion, was judged superior and went into production.

Center: The McDonnell XF-88 Voodoo was a good design but it was canceled due to budget cuts related to the Korean War. A few years later, the design was upgraded and put into production as the successful F-101 Voodoo.

Right: The Lockheed XF-90 was an absolutely beautiful plane, but it was too heavy and underpowered, and so it never went into production. The poor XF-90 never even got a nickname. Some of the technology developed for the XF-90 was later used for the successful F-104.


Posted in PyTorch | Leave a comment