NFL 2017 Week 6 Predictions – Zoltar Likes Underdogs Dolphins, Jets, Lions, and Giants

Zoltar is my NFL football machine learning prediction system. Here are Zoltar’s predictions for week #6 of the 2017 NFL season:

Zoltar:    panthers  by    1  dog =      eagles    Vegas:    panthers  by    3
Zoltar:     packers  by    0  dog =     vikings    Vegas:     packers  by  off
Zoltar:      ravens  by   10  dog =       bears    Vegas:      ravens  by    7
Zoltar:     falcons  by    6  dog =    dolphins    Vegas:     falcons  by   11
Zoltar:      texans  by   11  dog =      browns    Vegas:      texans  by  9.5
Zoltar:    redskins  by   11  dog = fortyniners    Vegas:    redskins  by   10
Zoltar:    patriots  by    4  dog =        jets    Vegas:    patriots  by  9.5
Zoltar:       lions  by    0  dog =      saints    Vegas:      saints  by  4.5
Zoltar:     jaguars  by    2  dog =        rams    Vegas:     jaguars  by  2.5
Zoltar:  buccaneers  by    0  dog =   cardinals    Vegas:  buccaneers  by  2.5
Zoltar:     raiders  by    9  dog =    chargers    Vegas:     raiders  by    3
Zoltar:      chiefs  by    7  dog =    steelers    Vegas:      chiefs  by    4
Zoltar:     broncos  by    6  dog =      giants    Vegas:     broncos  by   10
Zoltar:      titans  by    6  dog =       colts    Vegas:       colts  by  off

Zoltar theoretically suggests betting when the Vegas line is more than 3.0 points different from Zoltar’s prediction. For week #6 Zoltar has five hypothetical suggestions. As in recent weeks, most of them (four of the five) are on underdogs. It seems like Zoltar-2017 has a bias for underdogs, so I need to give Zoltar a tune-up when I get time.
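
Just to make the rule concrete, here is a tiny sketch of how I read that 3.0-point threshold. This is my own illustration, not Zoltar’s actual code, and the function name is hypothetical:

def suggestion(zoltar_favorite, zoltar_margin, vegas_favorite, vegas_margin):
    if zoltar_favorite == vegas_favorite:
        # both like the same team, so compare the margins directly
        delta = zoltar_margin - vegas_margin
    else:
        # Zoltar and Vegas disagree on the favorite, so the gap is the sum
        delta = zoltar_margin + vegas_margin
    if abs(delta) <= 3.0:
        return "no suggestion"
    # when delta > 0, bet on Zoltar's favorite; otherwise bet on the Vegas underdog
    return "bet on " + (zoltar_favorite if delta > 0 else "the Vegas underdog")

print(suggestion("falcons", 6, "falcons", 11.0))  # bet on the Vegas underdog (the Dolphins)
print(suggestion("raiders", 9, "raiders", 3.0))   # bet on raiders (the Vegas favorite)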

1. Zoltar likes the Vegas underdog Dolphins against the Falcons. Vegas believes the Falcons are 11.0 points better than the Dolphins, but Zoltar thinks the Falcons are only 6 points better. So, Zoltar thinks the Falcons will win but won’t cover the 11.0-point spread. Therefore, a bet on the Dolphins would pay you if the Dolphins win (by any score), or if the Falcons win but by fewer than 11 points.

2. Zoltar likes the Vegas underdog Jets against the Patriots. Vegas has the Patriots as 9.5 favorites but Zoltar thinks the Patriots are only 4 points better. Historically, Zoltar has done very well when picking the Jets as underdogs, and Zoltar has done terribly when picking against the Patriots as favorites. I’ll be interested in how this game plays out.

3. Zoltar likes the Vegas underdog Lions against the Saints. Vegas has the Saints as favorites by 4.5 points but Zoltar thinks the Saints and Lions are evenly matched.

4. Zoltar recommends the Vegas favorite Raiders against the Chargers. Vegas has the Raiders as just 3.0 points better than the Chargers, but Zoltar thinks the Raiders are 9 points better. Zoltar believes that humans are over-emphasizing the Raiders’ unexpectedly narrow win and the Chargers’ surprisingly large win last week.

5. Zoltar likes the Vegas underdog Giants against the Broncos. Vegas believes that the Broncos are a huge 10.0 points better but Zoltar thinks the Broncos are only 6 points better than the Giants.

When I ran Zoltar on a Tuesday morning, the Vegas point spread was “off” for two games: Packers vs. Vikings, and Titans vs. Colts. I’ll update this post if either of those games goes “on”.

Update: The Vegas line for the Packers – Vikings game is Packers favored by 3.5 points, so Zoltar recommends the underdog Vikings. The line for Titans vs. Colts is Titans by 8.0 points, so Zoltar does not recommend a bet.

==

Week #5 was good-news, bad-news — but mostly good-news for Zoltar. Against the Vegas spread, which is what matters most, Zoltar went a nice 3-1. And Zoltar would have been 4-0 if the Buccaneers kicker had been able to make a short field goal against the Patriots. That missed field goal caused a swing of tens of millions of dollars in actual betting.

On the other hand, if only a few plays had been different last week, Zoltar could just as easily have been 0-4. For the season, against the Vegas point spread, Zoltar is 13-6 (68% accuracy).

I also track how well Zoltar does when just predicting which team will win. This isn’t really useful except for parlay betting. Zoltar was a poor 8-6 just predicting winners.

For comparison purposes, I track how well Bing and the Vegas line do when just predicting who will win. In week #5, Bing was a mediocre 7-7 and Vegas was 6-8.

For the 2017 season so far, just predicting the winning team, Zoltar is 50-27 (65% accuracy), Bing is 45-32 (58% accuracy), and Vegas is 42-33 (56% accuracy).


My system is named after the Zoltar fortune teller machine

Posted in Machine Learning, Zoltar

Neural Network L2 Regularization using Python

I wrote an article titled “Neural Network L2 Regularization using Python” in the September 2017 issue of Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2017/09/01/neural-network-l2.aspx.

You can think of a neural network as a complicated mathematical prediction equation. To compute the constants (called weights and biases) that determine the behavior of the equation, you use a set of training data. The training data has known, correct input and output values. You use an algorithm (most often back-propagation) to find values for the NN constants so that computed output values closely match the correct output values in the training data.

A challenge when training a NN is over-fitting. If you train a network too well, you will get very low error (or equivalently, high accuracy) on the training data. But when you apply your NN model to new, previously unseen data, your accuracy can be very low.

There are several ways to try and limit NN over-fitting. One technique is called regularization. As it turns out, an over-fitted NN model often has constants that are very large in magnitude. Regularization keeps the values of the NN constants small.

There are two main forms of regularization, L1 and L2. L1 regularization penalizes the sum of the magnitudes of all the NN weights. L2 regularization penalizes the sum of the squared weights. My article explains exactly how L2 regularization works, and compares L1 and L2 regularization.
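
As a rough illustration (this is my sketch, not the article’s code), here is how the two penalty terms are typically added to a training error value in Python; the function name and the lam1/lam2 strengths are hypothetical:

import numpy as np

def penalized_error(base_error, weights, lam1=0.0, lam2=0.01):
    # base_error: ordinary training error (for example, mean squared error)
    # weights   : all NN weight values gathered into one array
    l1_penalty = lam1 * np.sum(np.abs(weights))   # L1: sum of magnitudes
    l2_penalty = lam2 * np.sum(weights ** 2)      # L2: sum of squared weights
    return base_error + l1_penalty + l2_penalty

w = np.array([0.80, -1.50, 0.04, 2.10])
print(penalized_error(0.25, w))   # L2 only: 0.25 + 0.01 * 7.3016, about 0.3230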

From a developer’s point of view, L2 regularization generally (but not always) works a bit better than L1 regularization. And L2 is a tiny bit easier to implement than L1. But L1 sometimes (but not always) automatically prunes away irrelevant predictor variables by setting their associated weight constants to zero.

As is often the case with machine learning, working with regularization is part art, part science, part intuition, and part experience.


“A Perfect Fit” (1863) – Luis Ruiperez

Posted in Machine Learning

Writing CNTK Code using Visual Studio 2017

I took a first stab at writing CNTK-based machine learning code using Visual Studio 2017. It worked very nicely, and I’m optimistic that CNTK with VS may become my default environment for ML code development.

Visual Studio is Microsoft’s tool for developers to write software. VS has been around for many years. It’s a very complex, very powerful tool.

CNTK v2 is a relatively new code library of sophisticated machine learning functions. CNTK itself is written in C++ but the usual way to use CNTK is through a Python language API.

VS did not directly support Python until just a few weeks ago. So, in theory at least, all the parts needed to code a CNTK program using VS were in place.

I installed VS2017 on a new machine. Because it was a new installation, I had the option of installing Python support, and so I did. The default Python install for VS2017 has Anaconda v4.4.0, but CNTK requires the older Anaconda v4.1.1, so I tracked down the older version. Anaconda is a base Python plus many related libraries that are more or less essential, notably NumPy and SciPy.

I installed the legacy Anaconda 4.1.1 and then used the pip utility to install CNTK v2.2 by pointing to a .whl (“wheel”) file. As I’m writing this blog post, it occurs to me that all this sounds somewhat complicated. It is. In a few years, all of these dependencies will probably just be part of a regular VS installation.

With everything installed, I launched Visual Studio and created a new Python application. I found the Python Environments window in VS, changed the default from v4.4.0 to the older v4.1.1, and then waited several minutes for the system to update itself — downloading and installing data for the auto-complete feature of VS.

I typed in some CNTK code, and the auto-complete feature worked perfectly:
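
The demo itself isn’t important here, but the kind of CNTK v2 Python code involved looks something like this minimal sketch (the layer sizes and sample input are arbitrary):

import numpy as np
import cntk as C

X = C.input_variable(4, np.float32)            # four input nodes
h = C.layers.Dense(5, activation=C.tanh)(X)    # one hidden layer
z = C.layers.Dense(3, activation=None)(h)      # three output nodes
model = C.softmax(z)

sample = np.array([[0.1, 0.5, 0.2, 0.7]], dtype=np.float32)
print(model.eval({X: sample}))                 # forward pass with random initial weights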

And then I ran the demo through the debugger, and, somewhat surprisingly, the program ran.

There’s no real big moral to the story here. I was able to get a CNTK environment up and running in VS2017 quite easily because I’ve been doing things like this for many years. I suspect that someone new to ML + VS Python + CNTK would have a very rough time getting everything to work.

Posted in CNTK

Time Series Regression using a C# Neural Network

I wrote an article titled “Time Series Regression using a C# Neural Network” in the October 2017 issue of Microsoft MSDN Magazine. See https://msdn.microsoft.com/en-us/magazine/mt826350.

I started my article by noting that the goal of a time-series regression problem is to make predictions based on historical time data. For example, if you have monthly sales data (over the course of a year or two), you might want to predict sales for the upcoming month. Time-series regression is usually very difficult, and there are many different techniques you can use.

In my article, I tackle a standard time series problem where the data is total international airline passengers per month, from January 1949 through December 1960 (144 months). I used a “rolling window” approach where the raw data is configured to look like this:

1.12, 1.18, 1.32, 1.29, 1.21
1.18, 1.32, 1.29, 1.21, 1.35
1.32, 1.29, 1.21, 1.35, 1.48
1.29, 1.21, 1.35, 1.48, 1.48
1.21, 1.35, 1.48, 1.48, 1.36
. . .
6.06, 5.08, 4.61, 3.90, 4.32

Each value is the number of passengers in 100,000s. Each set of four consecutive months is used to predict the passenger count for the next month. The window size of four is arbitrary and in general you must use trial and error to determine a good window size for each problem.
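
The article’s demo program is written in C#, but just to make the windowing idea concrete, here is a rough sketch in Python (the function name is mine, not the article’s):

def make_windows(series, window=4):
    # each set of 'window' consecutive values predicts the next value
    data = []
    for i in range(len(series) - window):
        inputs = series[i : i + window]   # e.g., [1.12, 1.18, 1.32, 1.29]
        target = series[i + window]       # e.g., 1.21
        data.append((inputs, target))
    return data

# the first several monthly passenger counts, in 100,000s
passengers = [1.12, 1.18, 1.32, 1.29, 1.21, 1.35, 1.48, 1.48, 1.36]
for inputs, target in make_windows(passengers)[:3]:
    print(inputs, "->", target)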

I used a neural network approach. In essence, this is just a normal regression problem (i.e., the goal is to predict a single numeric value) with specially formed training data.

The prediction model worked pretty well.

Time series regression is very challenging. This simple example is relatively easy, but real-life time series problems are among the most difficult problems in all of machine learning.


The Time Machine (1960)

Posted in Machine Learning

The Primal and Dual of an Optimization Problem

A couple of days ago, I was listening to an interesting talk on unsupervised machine learning classification. At one point, the speaker mentioned the primal-dual for an optimization problem.

I hadn’t thought about P-D for a long time. I taught a Quantitative Methods class several years ago where the idea popped up. The basic idea is that for some optimization problems, solving the original problem as stated is difficult, but solving a problem derived from the original is easier. The original problem is called the primal and the derived problem is called the dual.

For example, suppose you want to solve this linear programming problem:

constraint: 3x1 +  x2 <= 6
constraint: 2x1 + 4x2 <= 4
maximize  : 3x1 + 4x2

constraint: x1 >= 0, x2 >=0

This primal problem has solution x1 = 2, x2 = 0 and so the max is 3*2 + 4*0 = 6.

To form the dual, you first put the coefficients into matrix form:

 3  1  6
 2  4  4
 3  4  1

Notice I added a 1 in the lower right corner. Next, you transpose the matrix (the rows become columns):

 3  2  3
 1  4  4
 6  4  1

Now you rewrite using new variables, flipping the constraint inequalities (from <= to >=), and switching max to min:

constraint: 3y1 + 2y2 >= 3
constraint:  y1 + 4y2 >= 4
minimize  : 6y1 + 4y2

constraint: y1 >= 0, y2 >=0

This dual problem has solution y1 = 0, y2 = 1.5 and so the min is 6*0 + 4*1.5 = 6, which is the same value as the primal’s max.
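
As a quick numerical check (not part of the original class notes), both problems can be fed to scipy.optimize.linprog. Note that linprog minimizes, so the primal objective is negated, and the dual’s >= constraints are rewritten as <= by negating both sides:

from scipy.optimize import linprog

# primal: maximize 3x1 + 4x2  ->  minimize -(3x1 + 4x2)
primal = linprog(c=[-3, -4], A_ub=[[3, 1], [2, 4]], b_ub=[6, 4],
                 bounds=[(0, None), (0, None)])
print("primal x =", primal.x, " max =", -primal.fun)   # expect x = [2, 0], max = 6

# dual: minimize 6y1 + 4y2, with the >= constraints flipped to <= form
dual = linprog(c=[6, 4], A_ub=[[-3, -2], [-1, -4]], b_ub=[-3, -4],
               bounds=[(0, None), (0, None)])
print("dual y =", dual.x, " min =", dual.fun)   # min = 6; the dual has more than one optimal y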


This image popped up from an Internet search for “duality”

Posted in Machine Learning

NFL 2017 Week 5 Predictions – Zoltar Likes Four Underdogs and One Favorite

Zoltar is my NFL football prediction computer program. Here are Zoltar’s predictions for week #5 of the 2017 NFL season:

Zoltar:    patriots  by    0  dog =  buccaneers    Vegas:    patriots  by  4.5
Zoltar:        jets  by    3  dog =      browns    Vegas:      browns  by    2
Zoltar:      eagles  by    6  dog =   cardinals    Vegas:      eagles  by  6.5
Zoltar:    dolphins  by    2  dog =      titans    Vegas:      titans  by    3
Zoltar:    steelers  by   10  dog =     jaguars    Vegas:    steelers  by    9
Zoltar:       bills  by    0  dog =     bengals    Vegas:     bengals  by    3
Zoltar:      giants  by    7  dog =    chargers    Vegas:      giants  by    4
Zoltar:       lions  by    6  dog =    panthers    Vegas:       lions  by    3
Zoltar:       colts  by    9  dog = fortyniners    Vegas:       colts  by  2.5
Zoltar:     raiders  by    6  dog =      ravens    Vegas:      ravens  by    0
Zoltar:    seahawks  by    0  dog =        rams    Vegas:        rams  by    1
Zoltar:     cowboys  by    5  dog =     packers    Vegas:     cowboys  by  2.5
Zoltar:      chiefs  by    3  dog =      texans    Vegas:      chiefs  by  1.5
Zoltar:     vikings  by    2  dog =       bears    Vegas:     vikings  by  3.5

Zoltar theoretically suggests betting when the Vegas line is more than 3.0 points different from Zoltar’s prediction. For week #5 Zoltar has five hypothetical suggestions. Four of the five are on underdogs — I’m sensing that Zoltar-2017 has a bit too much bias for underdogs so I should probably give Zoltar a tune-up when I get time.

1. Zoltar likes the Vegas underdog Buccaneers against the Patriots. Vegas believes the Patriots are 4.5 points better than the Buccaneers, but Zoltar thinks the two teams are evenly matched, and so the Patriots may win but they won’t cover the point spread. A bet on the Buccaneers will pay you if the Buccaneers win (by any score), or if the Patriots win but by 4 points or less.

2. Zoltar likes the Vegas underdog Jets against the Browns. Vegas has the currently winless Browns as favorites by 2.0 points but Zoltar thinks the Jets are 3 points better than the Browns.

3. Zoltar (somewhat inexplicably) likes the Vegas underdog Dolphins against the Titans. Vegas has the Titans as favorites by 3.0 points but Zoltar thinks the Dolphins are 2 points better. I’d better check under Zoltar’s hood — this prediction doesn’t make human sense.

4. Zoltar recommends the Vegas favorite Colts against the 49ers. Vegas has the Colts as 2.5 point favorites, but Zoltar thinks the Colts are 9 points better than the 49ers.

5. Zoltar tentatively likes the Vegas underdog Raiders against the Ravens. When I ran Zoltar on a Tuesday morning, the Vegas point spread was “off” because of an injury to the Raiders quarterback, so I’ll have to see where the line goes.

Update: The Vegas line for the Raiders – Ravens game is Raiders favored by 3.0 points so Zoltar does not have a recommendation for the game.

==

Week #4 was good-ish and bad-ish, but mostly bad-ish for Zoltar. Against the Vegas spread, which is what matters most, Zoltar went only 1-2. Zoltar correctly predicted the Vegas underdog Jets would beat the Jaguars. But Zoltar didn’t believe Vegas favorites Saints (vs. Dolphins) and Seahawks (vs. Colts) would cover the spread. Ouch. The Saints and the Seahawks destroyed their opponents. Well, that’s why it’s called gambling. For the season, against the Vegas spread, Zoltar is still a pretty good 10-5 for 67% accuracy.

I also track how well Zoltar does when just predicting which team will win. This isn’t really useful except for parlay betting. Zoltar was a decent 10-6 just predicting winners. For the season, Zoltar is 42-21 (67% accuracy).

For comparison purposes, I track how well Bing and the Vegas line do when just predicting who will win. In week #4, Bing was a mediocre 8-8 and Vegas was 8-7. (Vegas had one game with no favorite). For the season so far, just predicting the winning team, Bing is 38-25 (60% accuracy) and Vegas is 36-25 (59% accuracy).

Posted in Machine Learning, Zoltar

Is it Possible for Squared Error and Cross Entropy Error to Disagree?

I was giving a micro-talk about neural network classification error and accuracy, and in mid-talk I wondered out loud whether it would be possible for mean squared error and mean cross entropy error to give different results with regard to which of two prediction models is better.

One of the people who was listening to my talk, Ryan G, came up with a pathological example showing it is possible. I expanded his example slightly:

In the example there are two different NN prediction models. For simplicity, there are just two things to predict: (1,0,0) and (0,1,0). These would correspond to a problem where there are three class labels to predict, such as “democrat”, “republican”, “other”.

Model A outputs probabilities (0.51, 0.48, 0.01) and (0.03, 0.50, 0.47). Model A gets both predictions correct so it has 100% classification accuracy. The squared error for the first item is (0.51 – 1)^2 + (0.48 – 0)^2 + (0.01 – 0)^2 = 0.4706. Similarly, the squared error for the second data item = 0.4718. So the mean squared error for Model A is (0.4706 + 0.4718) / 2 = 0.4712.

The cross entropy error (also called log-loss) for the first item in model A is – [ ln(0.51) * 1 + ln(0.48) * 0 + ln(0.01) * 0 ] = 0.6733. Similarly, the cross entropy error for the second item = 0.6931. So the mean cross entropy error for Model A is 0.6832.

Now the second model, Model B, outputs probabilities (0.48, 0.26, 0.26) and (0.28, 0.49, 0.23). The mean squared error for Model B is 0.3985 and the mean cross entropy error is 0.7237.

Therefore, using mean squared error you’d conclude that Model B is better because its MSE is lower (0.3985 vs. 0.4712). But using mean cross entropy error you’d conclude that Model A is better (0.6832 vs. 0.7237).
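
Here is a short sketch (mine, not from the talk) that reproduces the four error values with NumPy:

import numpy as np

targets = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float64)
model_a = np.array([[0.51, 0.48, 0.01], [0.03, 0.50, 0.47]])
model_b = np.array([[0.48, 0.26, 0.26], [0.28, 0.49, 0.23]])

def mean_squared_error(probs, targets):
    # per-item sum of squared differences, averaged over items
    return np.mean(np.sum((probs - targets) ** 2, axis=1))

def mean_cross_entropy(probs, targets):
    # -ln of the probability assigned to the correct class, averaged over items
    return np.mean(-np.log(np.sum(probs * targets, axis=1)))

for name, m in [("A", model_a), ("B", model_b)]:
    print(name, mean_squared_error(m, targets), mean_cross_entropy(m, targets))
# A: MSE = 0.4712, MCE = 0.6832;  B: MSE = 0.3985, MCE = 0.7237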

This pathological example points out that squared error takes into account all of the output probabilities, but cross entropy error looks only at the single probability associated with the target value of 1.

You’d certainly never encounter this situation in any real-life scenario I can think of, but it was an interesting little puzzle.


“Entropy” by Don Allen

Posted in Machine Learning