NFL 2016 Week 19 Predictions – Zoltar Likes the Chiefs over the Steelers

Zoltar is my NFL prediction computer program. Here are Zoltar’s predictions for week 19 (the divisional playoffs) of the 2016 NFL season:


Zoltar:     falcons  by    4  dog =    seahawks    Vegas:     falcons  by  4.5
Zoltar:    patriots  by   10  dog =      texans    Vegas:    patriots  by   16
Zoltar:      chiefs  by    6  dog =    steelers    Vegas:      chiefs  by    1
Zoltar:     cowboys  by    2  dog =     packers    Vegas:     cowboys  by    4

Zoltar theoretically suggests betting when the Vegas line is more than 3.0 points different from Zoltar’s prediction. For week 19 Zoltar has one suggestion.

1. Zoltar likes the Vegas favorite Chiefs against the Steelers. Zoltar thinks the Chiefs are 6.0 points better than the Steelers, but Vegas says the Chiefs will win by only 1.0 point. So a bet on the Chiefs will pay off if the Chiefs win by 2 or more points.
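
Expressed in code, the bet rule is just a threshold check. Here’s a minimal R sketch of the idea (illustrative only, not Zoltar’s actual code) using the Chiefs game numbers:

# minimal sketch of the bet rule (illustrative only)
zoltar_margin <- 6.0    # Zoltar: Chiefs by 6.0 points
vegas_line <- 1.0       # Vegas: Chiefs by 1.0 point
if (abs(zoltar_margin - vegas_line) > 3.0) {
  cat("suggested bet: take the side Zoltar favors relative to the Vegas line\n")
}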

Note: Zoltar theoretically likes the Vegas underdog Texans against the Patriots; however, my advanced Zoltar says the Patriots will crush the Texans. Also, advanced Zoltar says that the Falcons will beat the Seahawks by 9.0 points, mostly due to the injury to the Seahawks’ strong safety combined with the Falcons’ passing statistics.


In week 18, Zoltar went 0-0 against the Vegas point spread, because he didn’t recommend any bets.

For the 2016 regular season plus playoff games, Zoltar is 43-28 against the Vegas spread, for 61% accuracy. Historically, Zoltar usually finishes between 62% and 72% accuracy against the Vegas spread over the course of an entire season, so Zoltar is doing only OK.

Theoretically, if you must bet $110 to win $100 (typical), then you’ll make money in the long run if you predict at 53% accuracy or better. But realistically, you need to predict at 60% accuracy or better.
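
That 53% figure comes from simple arithmetic: at win rate p, the expected profit on a $110-to-win-$100 bet is 100p - 110(1 - p), which hits zero at p = 110/210, or about 52.4%. A quick sanity check in R:

# break-even win rate when risking $110 to win $100
p_breakeven <- 110 / (110 + 100)    # 0.5238, about 52.4%
expected_profit <- function(p) { 100 * p - 110 * (1 - p) }   # per-bet expectation
expected_profit(p_breakeven)        # essentially 0 at the break-even point
expected_profit(0.60)               # about $16 per bet at 60% accuracy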

Just for fun, I track how well Zoltar and Cortana/Bing Predictions do when trying to predict just which team will win a game. This isn’t useful except for parlay betting.

In week 18, just predicting winners, Zoltar was 4-0 (but so was almost everyone else). Cortana/Bing was 3-1 just predicting winners (Cortana incorrectly predicted the Raiders would beat the Texans).

For the 2016 season, just predicting winners, Zoltar is 178-80 (69% accuracy). Cortana/Bing is 165-93 (64% accuracy). There were two tie games in the season, which I didn’t include.

Note: Zoltar sometimes predicts a 0-point margin of victory. In those situations, to pick a winner so I can compare against Cortana/Bing, Zoltar picks the home team to win during the first four weeks of the season. After week 4, Zoltar uses historical data for the current season (which usually results in a prediction that the home team will win).
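
Here’s a minimal R sketch of that tie-break rule. The home_win_rate input and the 0.5 cutoff are simplified stand-ins for the historical data; the real logic is more involved:

# simplified sketch of the 0-point tie-break rule (stand-in details)
pick_winner <- function(margin, home, away, week, home_win_rate = 0.55) {
  if (margin > 0) return(home)              # margin = predicted home-team margin
  if (margin < 0) return(away)
  if (week <= 4) return(home)               # early season: just take the home team
  if (home_win_rate >= 0.5) home else away  # later: season-to-date home results
}
pick_winner(margin = 0, home = "home_team", away = "away_team", week = 5)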

[Image: Zoltar speaks]

Posted in Machine Learning

Top Ten Software Developer Conferences in 2017 for .NET Developers

The title of this blog post needs a quick explanation. There are hundreds of conferences every year that are somehow related to software development. This list is aimed at software developers who work primarily (but not necessarily exclusively) with Microsoft technologies, and who are based in the U.S.

[Image: DevConnections hallway]


1. Developer Week February 11-16, 2017. San Francisco. http://www.developerweek.com. I’ve never been to this event, but it looks to be a collection of smaller events. Worth checking out.


2. Visual Studio Live March 13-17, 2017. Las Vegas. http://www.vslive.com. VS Live is a series of smaller (a few hundred attendees) events held in seven cities throughout the year. The Las Vegas edition is my favorite, and I’ll be speaking there this year. If you’re a .NET developer, you should definitely give this conference strong consideration. Recommended.


3. Devoxx March 21-23, 2017. San Jose. https://devoxx.us. This fairly large conference has a wide variety of talks, but most are related directly or indirectly to Java.


4. OSCON May 8-11, 2017. Austin, Texas. http://www.oscon.com. The major Open Source event. This event always has a weird feel to it: in addition to tech talks, there are sessions on quite peripheral topics. I usually speak at OSCON, but won’t this year.


5. Microsoft Build May 10-12, 2017. Seattle. https://build.microsoft.com. This is a very big event, typically about 7,000 people. The plus side is that every imaginable Microsoft topic is covered. The negative side is that there’s a lot of marketing chit-chat. I often speak at Build but don’t think I’ll do so this year.


6. PyCon May 17-25, 2017. Portland, Oregon. https://us.pycon.org/2017. This Python conference is a low-key, run-on-a-budget event, but that’s part of its appeal to me. The use of Python is increasing rapidly, especially in conjunction with machine learning.


7. DevIntersection May 21-24, 2017. Orlando. https://www.devintersection.com. This event covers all Microsoft related development. One of my three favorite events. In addition to the Orlando edition, there will be a Las Vegas edition in November. I prefer Vegas because it’s much closer to me. Highly recommended.


8. Microsoft Ignite September 25-29, 2017. Orlando. https://ignite.microsoft.com. This huge event is aimed more at IT engineers who do development rather than pure developers. When I speak at Ignite it’s usually about PowerShell development. Pros: a huge event. Cons: a huge event.


9. JavaOne October 1-5, 2017. San Francisco. https://www.oracle.com/javaone. This is the major Java event. Like many developers, I’m not entirely happy with the way Oracle is handling Java, but this is still the event for Java developers.


10. DevConnections October 23-26, 2017. San Francisco. http://www.devconnections.com. This is a long-running event and one of my favorites. The event moves to San Francisco after many, many years in Las Vegas. Definitely worth investigating.


[Image: VS Live big board]

Posted in Conferences

The Missing Square Trick and Evil Ed

My old college roommate Ed is evil. But in a good way.

In my undergraduate days at UC Irvine, one of my roommates was Ed K. We went through all kinds of adventures together at school, in Las Vegas, in Miami, and working at Disneyland. He’s still one of my very best friends.

But Ed knows exactly how to distract me. He sent me a link to a fascinating video of a magic trick that is related to geometry. Ed knows I love math and magic, so I spent an hour playing and replaying the video to try to figure out how it was done. Evil Ed.

I still don’t know how the trick was done, but it reminded me of The Missing Square Puzzle Trick. Look at the image below closely. The top multi-triangle (made of four shapes) and the bottom multi-triangle (made of the exact same four shapes) both appear to be 13 units wide and 5 units high. But the bottom multi-triangle has a missing 1×1 square! Where did it go?

[Image: the missing square puzzle trick]

The Wikipedia entry at https://en.wikipedia.org/wiki/Missing_square_puzzle explains the trick. Anyway, I think this may have something to do with the magic trick video Evil Ed sent me.
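
The key fact behind the missing square, which you can verify with two quick divisions, is that the two small triangles in the figure have different slopes, so the overall “hypotenuse” isn’t actually a straight line:

# the two small triangles have different slopes, so the "hypotenuse"
# of the assembled 13 x 5 figure is not actually straight
2 / 5    # 0.400 -- slope of the 2 x 5 triangle
3 / 8    # 0.375 -- slope of the 3 x 8 triangle
# the thin sliver between the bent-in and bent-out "hypotenuses" has
# area exactly 1.0, which accounts for the missing 1 x 1 square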

Posted in Miscellaneous

Introduction to Microsoft CNTK Machine Learning Tool

I wrote an article titled “Exploring the Microsoft CNTK Machine Learning Tool” in the January 2017 issue of Microsoft MSDN Magazine. See https://msdn.microsoft.com/en-us/magazine/mt791798.

[Image: CNTK demo run]

When I wrote my article several weeks ago, CNTK stood for “Computational Network Toolkit.” Weirdly, the name has since been changed to the “Microsoft Cognitive Toolkit,” but the acronym is still CNTK. You gotta love Marketing people.

Anyway, CNTK is a command-line program that can do regular and deep neural network analyses. CNTK was originally developed as an internal Microsoft tool and is under very rapid development, so the existing documentation is pretty weak: it’s incomplete and lags behind the code base. My goal was to write a complete end-to-end tutorial.

CNTK is a direct competitor to Google’s TensorFlow tool. I’ve used both TensorFlow and CNTK, and I prefer CNTK. Both tools have a ton of room for improvement, primarily in the area of documentation, but CNTK just has a bit of a nicer feel to me. Of course this is partially because CNTK is mostly a Windows tool and TensorFlow runs on Ubuntu, and I work much more often on Windows than on Linux.

[Image: neural network architecture]

To use CNTK, you write a fairly complex configuration file that specifies where the data is, the architecture of the neural network to use, various parameters of the training routine, and so on. Creating a CNTK configuration file is not easy, but CNTK is a powerful tool for difficult problems.
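
To give you a feel for what I mean, here is a much-simplified outline of a CNTK configuration file. The key names and values below are illustrative, based on the version I used; CNTK changes quickly between releases, so check the current documentation:

# much-simplified outline of a CNTK configuration file (illustrative values)
command = trainDemo

trainDemo = [
  action = "train"
  modelPath = "Models\demo.cntk"

  SimpleNetworkBuilder = [
    layerSizes = 4:5:3                # 4 inputs, 5 hidden nodes, 3 outputs
    trainingCriterion = "CrossEntropyWithSoftmax"
  ]

  SGD = [
    minibatchSize = 25
    learningRatesPerMB = 0.05
    maxEpochs = 500
  ]

  reader = [
    readerType = "CNTKTextFormatReader"
    file = "trainData.txt"
    # (input stream definitions omitted)
  ]
]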

CNTK: two technical thumbs-up.

Posted in Machine Learning

NFL 2016 Week 18 Predictions – Zoltar Agrees with Las Vegas

Zoltar is my NFL prediction computer program. Here are Zoltar’s predictions for week 18 (wild card playoff games) of the 2016 NFL season:


Zoltar:     raiders  by    0  dog =      texans    Vegas:      texans  by  3.5
Zoltar:    seahawks  by    6  dog =       lions    Vegas:    seahawks  by    8
Zoltar:    steelers  by    6  dog =    dolphins    Vegas:    steelers  by   10
Zoltar:     packers  by    6  dog =      giants    Vegas:     packers  by  4.5

Zoltar theoretically suggests betting when the Vegas line is more than 3.0 points different from Zoltar’s prediction. For week 18 Zoltar has two suggestions.

1. Zoltar (sort of) likes the Vegas underdog Raiders against the Texans. Zoltar thinks the two teams are evenly matched, but Vegas says the Texans will win by 3.5 points. The Raiders’ starting QB is out, which makes this game very difficult to predict. An enhanced version of Zoltar that takes injuries into account says the Texans will win by 1.0 point, so enhanced Zoltar recommends no bet.

2. Zoltar likes the Vegas underdog Dolphins against the Steelers. The basic Zoltar thinks the Steelers are just 6.0 points better than the Dolphins, but Vegas says the Steelers will win by 10.0 points. The enhanced Zoltar thinks the Steelers will win by 8.0 points, so that version of Zoltar does not recommend a bet.


In week 17, Zoltar went 6-3 against the Vegas point spread. There were many big point spread changes in mid-week (related to injuries and players sitting out).

For the 2016 regular season, Zoltar finished 43-28 against the Vegas spread, for 61% accuracy. Historically, Zoltar usually finishes between 62% and 72% accuracy against the Vegas spread over the course of an entire season, so this was only an OK season for Zoltar.

Theoretically, if you must bet $110 to win $100 (typical), then you’ll make money in the long run if you predict at 53% accuracy or better. But realistically, you need to predict at 60% accuracy or better, so from that perspective Zoltar was successful.

Just for fun, I track how well Zoltar and Cortana/Bing Predictions do when trying to predict just which team will win a game. This isn’t useful except for parlay betting.

In week 17, just predicting winners, Zoltar was a good 13-3. Cortana/Bing was also 13-3 just predicting winners (I have no idea how Cortana knew the Eagles would beat the Cowboys).

For the 2016 season, just predicting winners, Zoltar finished 174-80 (69% accuracy). Cortana/Bing finished 162-92 (64% accuracy). There were two tie games in the season, which I didn’t include.

Note: Zoltar sometimes predicts a 0-point margin of victory. In those situations, to pick a winner so I can compare against Cortana/Bing, Zoltar picks the home team to win during the first four weeks of the season. After week 4, Zoltar uses historical data for the current season (which usually results in a prediction that the home team will win).

[Image: Zoltar predicts NFL]

Posted in Machine Learning

The MS Bot Framework

I took a look at the MS Bot Framework recently. The Bot Framework is basically pre-built skeleton code that allows a human user to send typed messages to a Web service and have the service respond.

It’s not magic. You have to supply all the logic for the conversation. However, the Bot Framework does supply all the scaffolding code, which saves a huge amount of time, and the Framework has hooks into some pretty cool libraries of AI code, such as MS Cognitive Services.

In my micro-demo, I created a bot that can answer questions about this week’s NFL football games. I used the logic from my Zoltar football prediction program.

[Image: ZoltarBot demo]

The demo was created using Visual Studio with the C# language (JavaScript via NodeJS is also supported if you’re a glutton for technical punishment). Once the bot service was running I used a nice little add-on program called the Bot Framework Emulator to send messages to my locally running ZoltarBot.

Bottom line: the MS Bot Framework isn’t anything radically new — it’s just a Web service — but it does nicely package code that greatly simplifies creating a bot. I give the Bot Framework a thumbs-up.

Posted in Machine Learning

Factor Analysis

Factor analysis is a classical statistics technique that isn’t used much in machine learning, but it can be quite valuable. As is often the case with statistics and ML, it’s a bit tricky to explain what factor analysis is without going into a huge amount of detail.

Briefly, if you have a dataset that has many variables, factor analysis can tell you if some of the variables are actually due to a hidden, latent variable. The idea is best explained by an example.

Suppose you ask a bunch of people to rate 8 movies on a scale of 1 (bad) to 5 (excellent). The movies are The Fifth Element, Forbidden Planet, Dark City, Galaxy Quest, The Hangover, Meet the Parents, Ben Hur, and Gladiator.

The raw data might look like:

P01,5,4,2,3,1,2,4,5
P02,2,1,5,5,1,1,4,3
etc.

This means person 01 gives The Fifth Element a rating of 5, gives Forbidden Planet a rating of 4, and so on.

In this example, I’ve deliberately set up the problem so that there are three latent variables that explain the data – science fiction, comedy, and historical.

To do factor analysis in R, you can use the somewhat unfortunately named “factanal” function. If a data frame named dd holds the numeric part of the data, then you could call

fact3 <- factanal(dd, factors = 3)

to see how well three latent variables fit the data. The results for my dummy data are:

                Factor1 Factor2 Factor3
TheFifthElement  0.757          -0.355 
ForbiddenPlanet  0.977  -0.134         
TheHangover              0.940  -0.177 
MeetTheParents  -0.166   0.800  -0.218 
BenHur          -0.204  -0.254   0.915 
Gladiator       -0.224  -0.498   0.654 
GalaxyQuest      0.585   0.606  -0.435 
DarkCity         0.785          -0.240

Notice “Factor1” (which is science fiction) captures The Fifth Element, Forbidden Planet, Galaxy Quest, and Dark City extremely well. “Factor2” captures comedies The Hangover, Meet the Parents, and Galaxy Quest. And “Factor3” captures historical movies Ben Hur and Gladiator.

The SS Loadings output gives you a rough idea of how important each factor is:

               Factor1 Factor2 Factor3
SS loadings      2.607   2.222   1.724

As a rule of thumb, if an SS loading is greater than 1.0 the factor is relevant. If I ran the analysis with four factors, the SS loading for Factor4 would likely be less than 1.0 showing that it’s not important.

The chi-square test has the null hypothesis that the chosen number of factors is sufficient to explain the data, so higher p-values are better; a small p-value means you should reject the model and try a different number of factors. In my demo, the p-value is 0.1970.

Factor analysis is typically used with many variables. If you have v variables and f factors, then (v-f)^2 must be greater than v+f, otherwise there aren’t enough degrees of freedom for the chi-square test. (In my demo, v = 8 and f = 3, and 25 > 11, so the three-factor model is testable.)
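
For completeness, here’s a minimal end-to-end sketch in R. The file name and column layout are hypothetical; they just mirror the demo data above:

# end-to-end sketch (file name and column layout are hypothetical)
ratings <- read.csv("movie_ratings.csv", header = FALSE)
dd <- ratings[ , 2:9]                  # drop the person ID, keep the 8 rating columns
colnames(dd) <- c("TheFifthElement", "ForbiddenPlanet", "DarkCity", "GalaxyQuest",
                  "TheHangover", "MeetTheParents", "BenHur", "Gladiator")
fact3 <- factanal(dd, factors = 3)
print(fact3$loadings, cutoff = 0.3)    # hide small loadings for readability
fact3$PVAL                             # chi-square p-value for the 3-factor model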

[Image: factor analysis using R]

Posted in Machine Learning, R Language