March Madness and Machine Learning

As I write this blog post, I’m at the 2017 Visual Studio Live Conference in Las Vegas. By coincidence, the 2017 NCAA college basketball tournament (“March Madness”) started just a few minutes ago.

I’ve always been fascinated by things related to probability and prediction. For example, every year I write a computer program that predicts the outcomes of NFL football games. In my mind, machine learning is any system that uses data to make some sort of a prediction, so predicting outcomes related to March Madness is a possible machine learning problem.

There is huge interest in the NCAA basketball tournament. By that I mean an enormous amount of money is bet on the games. The best estimate I’ve seen indicates that people will wager approximately $10 billion on March Madness over the next few weeks. That’s “billion” with a “b”.

A good friend of mine (PW) knew that I was in Vegas, so he sent me an e-mail message and asked me to place a $100 bet on UCLA for a friend of his (DL) to win the tournament. DL picked UCLA because that’s where he went to school. The Vegas odds of UCLA winning the tournament are about 12 to 1, so if UCLA wins, my friend’s friend will win about $1,200.

So, to place the bet, I walked across the street from Bally’s (where VS Live is) to the Bellagio Hotel, which has a big “sports book” (betting operation). There were hundreds of people there and tremendous energy, and the first game (Notre Dame vs. Princeton) had just tipped off. It was very exciting.

Now the interesting thing here is that the current odds are determined by how much people bet on each team. And for March Madness, people often tend to bet with their hearts (i.e., on the school they went to) rather than their heads. This likely creates imbalances in the odds that a sophisticated machine learning system could exploit. I wish I had time to explore such a prediction system, but I don’t.

Posted in Machine Learning

Loading a Text File into a Python Matrix

My Python programming has gotten a bit rusty lately, so I’ve been brushing up by doing short demo programs. In machine learning, a common task is to load a text file containing numbers into a matrix. So I wrote a demo that does just that, using some data from the well-known Iris Dataset.

I had a text file named testData.txt that looks like this:

5.0,3.5,1.3,0.3,1,0,0
4.5,2.3,1.3,0.3,1,0,0
. . .
5.9,3.0,5.1,1.8,0,0,1

There are 30 rows of data. The key function is defined as:

import numpy as np

def loadFile(df):
  # load a comma-delimited text file into an np matrix
  resultList = []
  f = open(df, 'r')
  for line in f:
    line = line.rstrip('\n')  # "1.0,2.0,3.0"
    sVals = line.split(',')   # ["1.0", "2.0", "3.0"]
    fVals = list(map(np.float32, sVals))  # [1.0, 2.0, 3.0]
    resultList.append(fVals)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
  f.close()
  return np.asarray(resultList, dtype=np.float32)  # already float32
# end loadFile

Parameter df (“data file”) is the path to the file. I walk through each line and a.) strip away the trailing newline using rstrip(), b.) separate the comma-delimited values into a list of strings using split(), c.) convert the list of strings to a list of float32 values using map(), and d.) append the current row-list to the overall result list. Because each row-list is already my desired type float32, I didn’t have to specify the dtype in asarray(), but I did so for clarity.

After all the rows have been processed, the list-of-lists is converted to a 30×7 NumPy matrix using the asarray() function, and that matrix is returned.
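
A quick sanity check of the function (assuming testData.txt, the 30-row file shown above, is in the current directory):

dataMatrix = loadFile("testData.txt")
print(dataMatrix.shape)  # (30, 7)
print(dataMatrix[0])     # first row, as float32 values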

There are more efficient ways to load a text file into a matrix, but this technique is fine for simple scenarios.

Posted in Machine Learning

A First Look at Visual Studio 2017

Visual Studio (VS) is the most common tool among software developers who work with Microsoft technologies. The most recent version, VS 2017, was released a few days ago. Because I use VS a lot, I figured I’d better check out the new VS to see how it compares with the previous VS 2015.

VS is an extremely complex program and can take many, many months or even years to master. So, my quick investigation was meant only to get a feel for what is new. The biggest change seems to be that VS can now create programs that use the new “.NET Core” framework. Loosely speaking, the classic .NET Framework, which was released in 2002, is a huge library of code. The new .NET Core framework is more modular and is open source.

I did a quick Console Application that runs in a shell, using .NET Core, and my reaction was pretty much a yawn — sure, there were lots of changes, but nothing I can’t figure out. I believe VS 2017 will continue to be a great tool (much better than, say, Eclipse) for creating desktop applications and code libraries using the C# language.

And then I gritted my teeth and took a look at the nightmare called ASP.NET to create a simple Web page. The history of ASP.NET is a sad one. Microsoft can never seem to get it quite right, while less sophisticated platforms like PHP and NodeJS keep things relatively simple and gain market share. First there was classic ASP, then Web Forms with ASP.NET, then Web API and MVC, and then there was Razor, and on and on and on. Each change to ASP.NET required a massive investment in time to learn. I mean, really, Web applications aren’t that complicated! It’s request-response for crying out loud!

And now we have ASP.NET Core. I wish I could be optimistic and say something like, “At last! Simple and effective and I’m sure it’ll be stable for at least a few years!”

I wish I could say that. But my initial reaction was, “Here we go again.”

Of course, after only a few hours of poking around with ASP.NET Core, I could well be completely wrong, but my initial experience wasn’t good. My problem with ASP.NET isn’t the technology — which is actually pretty awesome. It’s the constant and infuriating flux and platform instability which creates mountains of irrelevant and apparently contradictory documentation (because you can’t always be sure about which version you’re dealing with).

Bottom line: VS 2017 looks good. Many of the new features are related to the overall development process (such as collaboration and integrated testing). I haven’t formed an opinion on .NET Core yet, but the Web part of Core, ASP.NET Core, feels overly complex.

Posted in Miscellaneous

My Top Ten Favorite Snowy Science Fiction Movies

There are quite a few science fiction films that take place in the Arctic (North Pole) or the Antarctic (South Pole), or just generally snowy environments. Here are my top 10 favorites.


1. Dreamcatcher (2003) – A really strange but interesting film that takes place almost entirely in snowy settings. The plot involves an alien invasion, a crazy U.S. Army colonel (Morgan Freeman), parasites, mind control, good aliens, and a mentally retarded hero played by Donnie Wahlberg.


2. Horror Express (1972) – In the early 1900s, scientists played by actors Christopher Lee and Peter Cushing are on a train in Siberia in winter. Also on the train is an alien who isn’t very nice. Scary and effective film.


3. The Thing from Another World (1951) – Science researchers at the North Pole discover a flying saucer buried in the ice. And a frozen alien. Excellent screenplay, and this film holds up well more than 65 years after its release.


4. The Thing (1982) – Researchers at the South Pole. An alien that can assimilate other creatures. If you’ve seen the film you probably remember the blood test scene. This film has a very confusing plot and a somewhat ambiguous ending.


5. The Crawling Eye (1958) – A group of people in the Swiss Alps run into aliens that look like huge eyeballs with tentacles. Surprisingly effective plot, sound effects, and acting.


6. The Thaw (2009) – Scientists (including one played by Val Kilmer) in the Canadian Arctic come into contact with deadly parasites. Rather creepy and scary film.


7. Snowpiercer (2013) – Bizarre film. After the earth goes into an ice age, the remnants of society live on an atomic-powered train. There is a caste system, and a character played by Chris Evans leads a revolt of the poor people.


8. Ice Soldiers (2013) – Scientists in the Canadian Arctic discover three frozen Soviet-era, genetically enhanced soldiers. One of the Canadian scientists has suspiciously enhanced physical and mental powers. Hmm.


9. Europa Report (2013) – A rescue mission to Europa (a frozen moon of Jupiter). Excellent special effects and realistic, but very slow moving.


10. Alien vs. Predator (2004) – The film title pretty much says it all. Takes place in the Antarctic.


Special Mention

One of the least-known films of all time is “Distant Early Warning” (1975). At an isolated radar station in the Arctic, during a fierce snowstorm, there’s a knock on the door! It is the deceased relative of one of the radar operators, who says there is a heaven and he’s come back to visit! This was a made-for-TV film shown on the ABC Wide World of Mystery anthology. There is virtually nothing known about this film. I saw it the one time it was shown and it made a huge impression on me.

Posted in Top Ten

Neural Networks with Raw Python and NumPy

I’ve been brushing up on my Python programming language skills. One thing I like to do with any language is implement a simple feed-forward neural network. The code to create a neural network uses all the basic control structures and language features (if-then, for-loops, while-loops, string concatenation, matrices, arrays, etc.).
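
For example, even just the feed-forward (input-to-output) pass exercises arrays, matrix products, and math functions. Here is a minimal sketch of that computation (the function and parameter names are my own, for illustration, not from my full implementation):

import numpy as np

def forward(x, ih_wts, h_biases, ho_wts, o_biases):
  # hidden layer: weighted sums plus biases, then tanh activation
  h_vals = np.tanh(np.dot(x, ih_wts) + h_biases)
  # output layer: weighted sums plus biases, then softmax
  o_sums = np.dot(h_vals, ho_wts) + o_biases
  exps = np.exp(o_sums - np.max(o_sums))  # shift to avoid overflow
  return exps / np.sum(exps)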

So, I tackled a neural network with a back-propagation plus momentum training algorithm. In addition to being a great way for me to get back in tune with Python, I now have an experimentation platform to investigate new algorithms related to neural networks. For example, both the Microsoft CNTK and Google TensorFlow code libraries have a relatively new (since about 2015) optimization algorithm called Adam (“Adaptive Moment Estimation”) that is very fast compared to basic stochastic gradient descent optimization.

I found an excellent blog post by a student named Sebastian Ruder that gave the best explanation of the Adam algorithm I’ve seen, at http://sebastianruder.com/optimizing-gradient-descent/index.html#adam. But I still won’t be fully satisfied I understand Adam until I implement it inside my Python neural network code.
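
Just to make the update rule concrete, here is a minimal sketch of one Adam step in raw NumPy, based on the equations in Ruder’s post (the function name and parameters are mine, and a real implementation would maintain m, v, and t inside the training loop):

import numpy as np

def adam_update(wts, grad, m, v, t, lr=0.001,
    beta1=0.9, beta2=0.999, eps=1.0e-8):
  # biased first moment (mean) and second moment (uncentered variance)
  m = beta1 * m + (1 - beta1) * grad
  v = beta2 * v + (1 - beta2) * (grad * grad)
  # bias-corrected estimates; t is the 1-based iteration counter
  m_hat = m / (1 - beta1 ** t)
  v_hat = v / (1 - beta2 ** t)
  # per-parameter adaptive step
  wts = wts - lr * m_hat / (np.sqrt(v_hat) + eps)
  return wts, m, v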

I’m fairly experienced with neural networks, but new algorithms continue to appear every few months. It’s a very exciting time to be involved with machine learning.

Posted in Machine Learning

Python Matrices

I’ve been using the Python programming language increasingly often over the past few months, mostly because two powerful machine learning libraries — Google’s TensorFlow and Microsoft’s CNTK — have Python interfaces.

For both tools, a solid knowledge of Python is essential so that the libraries can be customized to fit the problem at hand. In many situations this means reading data into a vector, or a matrix, or an n-dimensional array.

There are two basic ways to create a (two-dimensional) matrix in Python. You can use a built-in list-of-lists approach, or you can use the ndarray type from the NumPy (“numerical Python”) add-on package. So, here’s my personal reminder of the difference between the two.

A typical 3×4 list-of-lists style matrix is:

nRows = 3
nCols = 4
matrix_ll = [[0.0 for j in range(0, nCols)]
  for i in range(0, nRows)]
# matrix_ll[1,2] = 5.0  # error
matrix_ll[1][2] = 5.0
print("list-of-lists matrix: ")
print(matrix_ll)

There are of course zillions of alternatives. Using a NumPy approach:

import numpy as np

matrix_np = np.zeros(shape=[nRows,nCols],
  dtype=np.float64)
matrix_np[1,2] = 5.0  # my preferred indexing style
matrix_np[2][3] = 7.0  # list-like indexing style works too
print("numpy matrix: ")
print(matrix_np)

The NumPy approach is better in almost every way; the only drawback is that you take a dependency on the NumPy package.
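
One quick illustration of why, using the matrix_np and matrix_ll objects from above — whole-matrix operations are one-liners with NumPy but need explicit loops with a list-of-lists:

doubled = matrix_np * 2.0         # elementwise multiply, one line
col_sums = matrix_np.sum(axis=0)  # per-column sums, one line
# the list-of-lists equivalent needs an explicit comprehension:
doubled_ll = [[x * 2.0 for x in row] for row in matrix_ll]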

Posted in Machine Learning

Bengio and Deng Visit my Machine Learning Talk

I was giving a short talk on machine learning at Microsoft last week and the talk was visited by Yoshua Bengio and Li Deng. If you work in ML or AI you probably know these two names. Bengio (University of Montreal) and Deng (Microsoft Research) are two of arguably the five best-known names in deep learning. The other three top DL names are Andrew Ng (Stanford), Geoffrey Hinton (University of Toronto), and Yann LeCun (New York University).

Bengio was very gracious and answered questions from the people in attendance. In addition to being a really smart guy, Bengio is very articulate and an excellent presenter. Li Deng is also a very strong presenter.

In the picture below, I am farthest to the right, Deng has his back to the camera, and Bengio is taking off his coat so he can address the attendees.

The visit by Bengio and Deng reminded me that I’ve been very fortunate to have met some super smart people over the years. Many of the people I bump into at Microsoft Research are incredibly intelligent. And I was greatly influenced by several of my professors at UC Irvine, including super-famous Edward O. Thorp (see https://en.wikipedia.org/wiki/Edward_O._Thorp) and Nobel-winning F. Sherwood Rowland (https://en.wikipedia.org/wiki/F._Sherwood_Rowland).

In the area where I live, there are quite a few sports stars that I see at places like the local Starbucks and grocery store. And when I go to Las Vegas to speak at conferences, it’s not unusual for me to bump into famous movie stars. But, I never really get excited by people in those fields. For me, an exciting encounter is with someone who is famous for their intellectual achievements.

Posted in Conferences, Machine Learning