Neural Network L1 Regularization using Python

I wrote an article titled “Neural Network L1 Regularization using Python” in the December 2017 issue of Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2017/12/05/neural-network-regularization.aspx.

The most difficult part about L1 regularization is understanding what problem it solves. A full explanation of the problem scenario would take several pages — so, very, very briefly. . .

You can think of a neural network as a very complex math equation that makes a prediction. A neural network has numeric constants called weights (and special weights called biases) that determine the output of the network. Determining the values of the NN weights is called training the network. To train the network, you use a set of training data that has known input values and known, correct output values. You use an optimization algorithm (usually back-propagation) to find values for the weights so that when presented with the inputs in the training data, the computed output values closely match the known, correct output values. (Whew. Believe me, I’ve left out a ton of detail).

A major problem with NN training is called overfitting. This means that if you train the network too well, the network becomes very accurate on the training data, but when presented with new, previously unseen data, the trained network predicts poorly.

There are several techniques that can be used to deal with overfitting. One of these techniques is called regularization. There are two main forms of regularization, called L1 and L2. Again, a full explanation of L1 regularization would take many pages so . . .

Overfitting is often characterized by weight values that are very large. L1 regularization uses tricky math during the training process to limit the magnitudes of the network’s weights.
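
For example, here is a minimal sketch (my simplified illustration, not the code from the article) of how an L1 penalty typically shows up in a gradient-descent weight update. The lam value is the regularization strength:

lrn_rate = 0.01  # learning rate
lam = 0.001      # L1 regularization strength

def sign(x):
  return 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

def update_weight(w, grad):
  # grad is dError/dw from back-propagation; the lam * sign(w)
  # penalty term nudges each weight toward zero on every update
  return w - lrn_rate * (grad + lam * sign(w))

print(update_weight(0.85, 0.20))  # 0.84799 -- slightly closer to zero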

My article shows exactly how you’d go about doing this. In most cases you wouldn’t want to code a NN from scratch. But for me at least, the only way to fully understand what is going on with machine learning libraries like CNTK, TensorFlow, and Keras is to see how the code works.

Posted in Machine Learning | 1 Comment

NFL 2017 Week 15 Predictions – Zoltar Likes Favorites Lions, Chiefs, Ravens, Seahawks

Zoltar is my NFL football machine learning prediction system. It’s a hybrid system that uses a custom reinforcement learning algorithm plus a neural network. Here are Zoltar’s predictions for week #15 of the 2017 NFL season:

Zoltar:     broncos  by    0  dog =       colts    Vegas:     broncos  by    2
Zoltar:       lions  by   10  dog =       bears    Vegas:       lions  by  6.5
Zoltar:      chiefs  by    6  dog =    chargers    Vegas:      chiefs  by    1
Zoltar:       bills  by    2  dog =    dolphins    Vegas:       bills  by    3
Zoltar:      ravens  by   12  dog =      browns    Vegas:      ravens  by  7.5
Zoltar:     jaguars  by    6  dog =      texans    Vegas:     jaguars  by 11.5
Zoltar:      saints  by   10  dog =        jets    Vegas:      saints  by   16
Zoltar:    redskins  by    2  dog =   cardinals    Vegas:    redskins  by  4.5
Zoltar:     vikings  by   10  dog =     bengals    Vegas:     vikings  by 10.5
Zoltar:    panthers  by    5  dog =     packers    Vegas:    panthers  by  2.5
Zoltar:      eagles  by    5  dog =      giants    Vegas:      eagles  by    8
Zoltar:    seahawks  by    6  dog =        rams    Vegas:    seahawks  by    1
Zoltar:    steelers  by    1  dog =    patriots    Vegas:    patriots  by    2
Zoltar:      titans  by    5  dog = fortyniners    Vegas:      titans  by    2
Zoltar:     cowboys  by    0  dog =     raiders    Vegas:     cowboys  by  2.5
Zoltar:     falcons  by    4  dog =  buccaneers    Vegas:     falcons  by    6 

Zoltar theoretically suggests betting when the Vegas line is more than 3.0 points different from Zoltar’s prediction. In previous years, Zoltar would typically have hypothetical suggestions for only two or three games per week. But this season Zoltar and the Las Vegas line have differed greatly, and so Zoltar has been recommending six to eight games per week. The difference this season is the unusually large number of injuries to key players, especially quarterbacks.

This week, Zoltar has six hypothetical suggestions — more than the usual two or three.

1. Zoltar likes the Vegas favorite Lions over the Bears. Vegas has the Lions as a big favorite by 6.5 points, but Zoltar believes the Lions are a very large 10 points better than the Bears. A bet on the Lions will only pay off if the Lions win by more than 6.5 points (in other words, 7 points or more).

2. Zoltar likes the Vegas favorite Chiefs over the Chargers. Vegas has the Chiefs as a tiny favorite by 1.0 point, but Zoltar believes the Chiefs are 6 points better than the Chargers.

3. Zoltar likes the Vegas favorite Ravens against the poor Browns.

4. Zoltar likes the Vegas underdog Texans against the Jaguars. Vegas has the Jaguars as a whopping 11.5 points better than the Texans, but Zoltar thinks the Jaguars are only 6 points better. A bet on the Texans will pay if the Texans win outright, or if the Jaguars win but by less than 11.5 points (11 points or fewer). I think there’s an injury in this game (I haven’t run my advanced Zoltar yet — he takes injuries into account).

5. Zoltar likes the Vegas underdog Jets against the Saints. Vegas has the Saints as 16.0 point favorites — one of the biggest spreads I can remember this season.

6. Zoltar likes the Vegas favorite Seahawks against the Rams. (Historically, the Seahawks have done very poorly against the Rams, which may be affecting the point spread).

==

Zoltar had a decent but not great week last week. Against the Vegas spread, which is what Zoltar is designed to predict, Zoltar went 2-1. (Advanced Zoltar went 1-0 in week #14).

For the 2017 season so far, against the Vegas point spread, Zoltar is a pretty good 42-25 (62% accuracy). If you must bet $110 to win $100 (typical in Vegas) then you must theoretically predict with 53% or better accuracy to make money, but realistically you must predict at 60% or better accuracy.
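
As a quick check on that 53% figure:

# risking $110 to win $100, the break-even win probability p satisfies
# 100 * p - 110 * (1 - p) = 0, which gives p = 110 / 210
print(110.0 / 210.0)  # 0.5238 -- about 53%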

Just for fun, I also track how well Zoltar does when only predicting which team will win. This isn’t really useful except for parlay betting. For week #14, Zoltar was a pretty good 12-4 just predicting which team would win.

For comparison purposes, I also track how well Bing Predicts and the Vegas line do when just predicting which team will win. In week #14, Bing was 8-8, and Vegas was also 8-8 when just predicting winners. This isn’t a surprise, because as far as I can tell, Bing almost always picks the Vegas favorite.

For the 2017 season so far, just predicting the winning team, Zoltar is 143-65 (68.7% accuracy), Bing is 134-74 (64.4% accuracy), and Vegas is 128-73 (63.7% accuracy). The best humans are typically about 67% accurate predicting winners, so currently, Zoltar is slightly better than the best human experts. Bing is doing OK too, being slightly better than Vegas.

Note: Some of my numbers could be off a bit because of some weirdness a few weeks ago with games played outside the U.S.


My brother and I got an electric football game one year for Christmas – loved that game!

Posted in Machine Learning, Zoltar | Leave a comment

The Difference Between Linear Regression and Logistic Regression

Linear regression and logistic regression are two entirely different math techniques that are often confused, because their names are similar and because, when graphed, the two techniques appear very similar.

The graph below illustrates logistic regression. There are two numeric predictor variables, X1 and X2. The goal is to predict the class (0 or 1) for a pair of X1, X2 values. Each pair is plotted as a red dot or a blue dot. Logistic regression finds a line that separates the two classes. In this example, logistic regression might find a horizontal line at X2 = 3.5. Notice no straight line can get all the red dots on one side and all the blue dots on the other. The best you could ever do is get 10 out of 12 classifications correct.

The next graph illustrates linear regression. There is one numeric predictor variable, X, and one numeric value to predict, Y. Linear regression finds the equation of a line that predicts Y from X. Put another way, linear regression finds the best line through the (X, Y) data points.
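
To make the distinction concrete, here’s a minimal sketch of the two prediction equations. All the coefficient values are made up for illustration; setting b1 = 0 gives a horizontal separating line like the X2 = 3.5 line in the logistic example above:

import math

# logistic regression: predict a class (0 or 1) from X1 and X2
b0, b1, b2 = -3.5, 0.0, 1.0  # made-up coefficients
def predict_class(x1, x2):
  z = b0 + (b1 * x1) + (b2 * x2)
  p = 1.0 / (1.0 + math.exp(-z))  # probability of class 1
  return 1 if p >= 0.5 else 0

# linear regression: predict a numeric Y from X
m, b = 2.0, 1.0  # made-up slope and intercept
def predict_value(x):
  return (m * x) + b

print(predict_class(2.0, 5.0))  # 1 (the point is above X2 = 3.5)
print(predict_value(3.0))       # 7.0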

Notice that you can only make a simple graph for logistic regression when there are two predictor variables, and only make a simple graph for linear regression when there is one predictor variable.

To reiterate, when the simplest examples of logistic regression and linear regression are graphed, they appear somewhat similar and conceptually the goal in both is to find a line of some sort. But the similarity of appearance is deceiving because the two techniques are entirely different.

Posted in Machine Learning | Leave a comment

A First Look at ONNX

Version 1.0 of the Open Neural Network Exchange (ONNX) format was released on Wednesday, December 6, 2017. From what I can tell, ONNX is a specification standard for neural network models, so that different deep learning libraries can work together.

According to the Web site at http://www.onnx.ai, ONNX was created by Microsoft and Facebook. The ONNX specification appears to have official support from Amazon (AWS), as well as from hardware companies AMD, ARM, IBM, Intel, Huawei, NVIDIA, and Qualcomm.

Neural network tools initially supported by ONNX v1 include CNTK, PyTorch, Apache MXNet, Caffe2, and TensorRT. Noticeably missing is official support from Google and their TensorFlow library. However, it appears that there is some sort of converter that allows indirect interoperability with TensorFlow.

The field of AI/ML/NNs is evolving very rapidly, so I’ll have to keep an eye on ONNX.

Now if I were reading this blog post, at this point I’d have only a vague idea of what ONNX is. For me, I never understand a technology until I can see the code in action. So, I whipped up a simple neural network using the CNTK library, version 2.3, and saved the model using the statement:

nnet.save(".\\iris_fnn_v2.model",
  format=C.ModelFormat.CNTKv2)

Then I wrote another little program that loaded the saved model and then used it. The load statement was:

model = C.ops.functions.Function.load(".\\iris_fnn_v2.model",
  format=C.ModelFormat.CNTKv2)

OK, everything seemed to work.

Then I changed the save and load statements to use the new ONNX format:

nnet.save(".\\iris_fnn_onnx.model",
  format=C.ModelFormat.ONNX)

model = C.ops.functions.Function.load(".\\iris_fnn_onnx.model",
  format=C.ModelFormat.ONNX)

And everything seemed to work again.

Nice. If you examine the two screenshots, you’ll see I got slightly different formats for the output values, so there seem to be some minor differences between the models.

Anyway, if ONNX gains traction, it will likely allow different deep learning libraries to interoperate. This could have a huge impact on machine learning and AI.

Posted in CNTK, Machine Learning | Leave a comment

Writing CNTK Programs using the VS Code Editor

The Microsoft CNTK framework/library is a collection of powerful functions that you can use to write deep learning systems, for example, a deep neural network classifier. The most common way to use CNTK is to write a Python language program which calls into the CNTK functions.
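
For example, here’s a minimal sketch of what calling into CNTK from Python looks like (a made-up, untrained 4-5-3 network of the kind you might use for the Iris data):

import numpy as np
import cntk as C  # CNTK v2.3

# define a 4-(5)-3 network using CNTK layer functions
X = C.input_variable(4, np.float32)
h = C.layers.Dense(5, activation=C.tanh)(X)
nnet = C.layers.Dense(3, activation=None)(h)

# evaluate the (untrained) network on one made-up input item
unknown = np.array([[5.1, 3.5, 1.4, 0.2]], dtype=np.float32)
print(nnet.eval({X: unknown}))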

So, to write a CNTK program, you’re really writing a Python program. Somewhat weirdly, to edit a Python+CNTK program, my technique of choice is to use Notepad. Yes, plain old Notepad, as in just about the simplest text editor imaginable.

I’ve never quite found a fancy Python editor with debugger that I really like. But I keep trying different Python editors — there are dozens. The other day, I revisited the Visual Studio Code (VS Code) program. VS Code is sort of like a scaled down, simplified version of the Visual Studio (VS) program. Note that the similarity in names — Visual Studio Code vs. Visual Studio, and VS Code vs. VS — has caused quite a bit of confusion.

VS Code is a free, open source, cross-platform, multiple-language, programming editor and debugger. Contrary to what some references state, VS Code is definitely not a lightweight program — it has a significant learning curve (but nothing like the beast that is Visual Studio).

Well, the bottom line is that I really like VS Code a lot. I could list all sorts of technical pros and cons of VS Code relative to other Python+CNTK programming environments, but the reality is that subjective factors are always more important. Basically, VS Code just feels right.

Note that for some complex scenarios (such as integrating CNTK code with Azure Cloud storage), you probably need something like the full Visual Studio with the Visual Studio Tools for AI add-on extension. But for simple scenarios, VS Code may be a good choice. See https://code.visualstudio.com/. VS Code also supports the new Visual Studio Tools for AI extension.


“Venice Twilight” (1908) – Claude Monet. Impressionism is simple but powerful.

Posted in CNTK, Machine Learning | Leave a comment

Understanding k-NN Classification using C#

I wrote an article titled “Understanding k-NN Classification using C#” in the December 2017 issue of Microsoft MSDN Magazine. See https://msdn.microsoft.com/en-us/magazine/mt814421.

The goal of k-NN (“k nearest neighbors”) classification is to predict the class of an item based on two or more predictor variables. For example, you might want to predict the political leaning (conservative, moderate, liberal) of a person based on their age, income, years of education, and number of children.

The technique is very simple. You obtain a set of training data that has known input and class values. Then for an unknown item, you find the k nearest training data points, and then predict the most common class.

In the image below, there are three classes, indicated by the red, green, and yellow data points. Each item has two predictor variables. The blue dot is the unknown. If you set k = 4, the four closest points to the blue dot are the red at (5,3), the yellow at (4,2), the yellow at (4,1), and the green at (6,1). The most common class is yellow, so you predict yellow.
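
Here’s a minimal sketch of the technique. The four neighbor points are the ones named above; the position of the blue (unknown) dot and the two extra training points are made up for illustration:

import math

train = [((5, 3), 'red'), ((4, 2), 'yellow'), ((4, 1), 'yellow'),
         ((6, 1), 'green'), ((8, 4), 'green'), ((2, 5), 'red')]
unknown = (4.5, 2.0)  # assumed position of the blue dot
k = 4

# sort training items by Euclidean distance to the unknown item
ordered = sorted(train, key=lambda item:
  math.hypot(item[0][0] - unknown[0], item[0][1] - unknown[1]))
nearest = [label for (_, label) in ordered[:k]]

# predict the most common class among the k nearest neighbors
print(max(set(nearest), key=nearest.count))  # 'yellow'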

Compared to other classification algorithms, the advantages of k-NN classification include: easy to implement, can be easily modified for specialized scenarios, works well with complex data patterns, and the results are somewhat interpretable.

Disadvantages of k-NN classification include: the result can be sensitive to your choice of the value for k, the technique works well only when all predictor variables are strictly numeric, it’s possible to get a tie result prediction, and the technique doesn’t work well with huge training data sets.

Posted in Machine Learning | Leave a comment

NFL 2017 Week 14 Predictions – Zoltar Likes Two Favorites and One Dog

Zoltar is my NFL football machine learning prediction system. It’s a hybrid system that uses a custom reinforcement learning algorithm plus a neural network. Here are Zoltar’s predictions for week #14 of the 2017 NFL season:

Zoltar:     falcons  by    2  dog =      saints    Vegas:      saints  by    1
Zoltar:      texans  by    9  dog = fortyniners    Vegas:      texans  by    3
Zoltar:       bills  by    6  dog =       colts    Vegas:       bills  by    4
Zoltar:     packers  by   12  dog =      browns    Vegas:     packers  by  3.5
Zoltar:     bengals  by    6  dog =       bears    Vegas:     bengals  by  6.5
Zoltar:       lions  by    0  dog =  buccaneers    Vegas:       lions  by    1
Zoltar:     cowboys  by    3  dog =      giants    Vegas:     cowboys  by    5
Zoltar:     vikings  by    0  dog =    panthers    Vegas:     vikings  by    3
Zoltar:      chiefs  by    2  dog =     raiders    Vegas:      chiefs  by    4
Zoltar:     broncos  by    2  dog =        jets    Vegas:        jets  by    1
Zoltar:    chargers  by    4  dog =    redskins    Vegas:    chargers  by    6
Zoltar:      titans  by    1  dog =   cardinals    Vegas:      titans  by    3
Zoltar:      eagles  by    0  dog =        rams    Vegas:        rams  by    2
Zoltar:    seahawks  by    0  dog =     jaguars    Vegas:     jaguars  by    3
Zoltar:    steelers  by    7  dog =      ravens    Vegas:    steelers  by    7
Zoltar:    patriots  by    6  dog =    dolphins    Vegas:    patriots  by   11

Zoltar theoretically suggests betting when the Vegas line is more than 3.0 points different from Zoltar’s prediction. In previous years, Zoltar would typically have hypothetical suggestions for only two or three games per week. But this season Zoltar and the Las Vegas line have differed greatly, and so Zoltar has been recommending six to eight games per week. The difference this season is the unusually large number of injuries to key players. (Note: I don’t actually bet on games — for me the whole point is the machine learning algorithms. Well, OK, I’ll place a bet every now and then, just for fun).

This week, Zoltar is somewhat back to normal and has just three hypothetical suggestions.

1. Zoltar likes the Vegas favorite Texans over the 49ers. Vegas has the Texans as a moderate favorite by 3.0 points, but Zoltar believes the Texans are a large 9 points better than the 49ers. A bet on the Texans will only pay off if the Texans win by more than 3.0 points (in other words, 4 points or more — if the Texans win by exactly 3 points the game is a push).

2. Zoltar likes the Vegas favorite Packers over the Browns. Vegas has the Packers as a moderate favorite by 3.5 points, but Zoltar believes the Packers are a huge 12 points better than the Browns.

3. Zoltar likes the Vegas underdog Dolphins against the Patriots. Vegas thinks the Patriots are a huge 11.0 points better than the Dolphins, but Zoltar computes that the Patriots are only 6 points better. Note: I have an advanced version of Zoltar, and advanced Zoltar thinks the Patriots are 14 points better than the Dolphins, and so does not recommend any action on this game.

==

Zoltar had a pretty good week last week. Against the Vegas spread, which is what Zoltar is designed to predict, Zoltar went a nice 4-2. (Advanced Zoltar went 3-1 in week #13).

For the 2017 season so far, against the Vegas point spread, Zoltar is a decent-but-not-great 40-24 (62% accuracy). If you must bet $110 to win $100 (typical in Vegas) then you must theoretically predict with 53% or better accuracy to make money, but realistically you must predict at 60% or better accuracy.

Just for fun, I also track how well Zoltar does when only predicting which team will win. This isn’t really useful except for parlay betting. For week #13, Zoltar was a very good 13-3 just predicting which team would win.

For comparison purposes, I also track how well Bing Predicts and the Vegas line do when just predicting which team will win. In week #13, Bing was a good 12-4, and Vegas was a weak 11-5 when just predicting winners.

For the 2017 season so far, just predicting the winning team, Zoltar is 131-61 (68% accuracy), Bing is 127-65 (66% accuracy), and Vegas is 120-65 (65% accuracy). The best humans are typically about 67% accurate predicting winners, so currently, Zoltar and Bing are just about as good as the best humans when just predicting which team will win.


My system is named after the Zoltar arcade fortune teller machine

Posted in Machine Learning, Zoltar | Leave a comment