Baby Shoes, Mathematics, and Simulation

I ran across an image on the Internet recently that intrigued me. A device titled “Baby Shoes”.


Baby Shoes machine

The URL associated with the image was dead, but the device looks like an old countertop gambling machine, perhaps from the 1930s or '40s. Apparently you put a nickel in the machine and it rolls five dice.

If the sum of the dice is 25 you win “Two Packs”. If the sum is 8, you win “Six Packs”, and so on:

Sum of Dice   Prize
===========================
    25        Two Packs
    10        Two Packs
     9        Four Packs
     8        Six Packs
     7        Ten Packs
     6        Twenty Packs
     5        Forty Packs
    30        Forty Packs

In old gambling games like this, the winning items are often disguises for money, to avoid gambling laws. For example, maybe “Six Packs” supposedly means six packs of chewing gum but really means six dollars or six half-dollars or something like that.

I love calculating probabilities so I wondered what the probabilities were for this game. I’ve done problems like this many times, but I know from experience these math problems are extremely tricky and it’s very, very easy to make a mistake.


Simulation program.

When I see a problem like this now, my method of choice is to write a simulation program instead of doing the combinatorial mathematics.
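
Here's a minimal sketch of that kind of simulation in Python. It's not the exact program I used, but it's the same idea; the winning sums come from the prize card above.

# baby_shoes_sim.py
# minimal sketch of a Baby Shoes dice simulation

import random
random.seed(1)

WINNING_SUMS = [25, 10, 9, 8, 7, 6, 5, 30]  # sums listed on the prize card

def play_once():
  # roll five dice and return their sum
  return sum(random.randint(1, 6) for _ in range(5))

def simulate(num_plays):
  counts = { s : 0 for s in WINNING_SUMS }
  other = 0
  for _ in range(num_plays):
    total = play_once()
    if total in counts:
      counts[total] += 1
    else:
      other += 1
  for s in WINNING_SUMS:
    print("Frequency of sum = %3d: %0.4f" % (s, counts[s] / num_plays))
  print("Frequency of other sum: %0.4f" % (other / num_plays))

simulate(1000000)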

Using 1,000,000 simulated plays, my results were:

Frequency of sum =  25: 0.0162
Frequency of sum =  10: 0.0161
Frequency of sum =   9: 0.0090
Frequency of sum =   8: 0.0045
Frequency of sum =   7: 0.0019
Frequency of sum =   6: 0.0006
Frequency of sum =   5: 0.0001
Frequency of sum =  30: 0.0001

Frequency of other sum: 0.9514

As I suspected, the payoffs mostly make sense but aren't entirely mathematically consistent. For example, the probability of getting a sum of 9 (p = 0.0090) is twice that of getting a sum of 8 (p = 0.0045), but the payoff for a sum of 8 isn't twice the payoff for a sum of 9.

The name “Baby Shoes” is a bit odd, but luckily I love old movies and I’ve watched scenes where a gambler is about to roll a pair of dice and says, “Come on! Baby needs new shoes!” to encourage a good result so he can win and buy his baby a pair of shoes.

Part of my lifelong love affair with mathematics and computers is due to writing simulation programs like this when I was in college. I can still vividly remember writing a Craps dice game simulation during my undergrad days at U.C. Irvine. Baby Shoes is an easy problem, but there are many types of problems that just can't be solved using standard mathematics because they're too complex, and a simulation is the only feasible approach. For example, imagine a dice game with a set of 100 dice.



Former first lady Michelle Obama (left) and current first lady Melania Trump (center) shown getting off Air Force One, both wearing tennis shoes. My hunch is both of them have plenty of shoes. I'd say the girl on the right could use a new pair of shoes, the geek-appeal of her current pair made of Legos notwithstanding.

Posted in Miscellaneous

Introduction to the ML.NET Library

I wrote an article titled “Introduction to the ML.NET Library” in the October 2018 issue of MSDN Magazine. See https://msdn.microsoft.com/en-us/magazine/mt830377.

Most machine learning code libraries are written in C++ (for performance) but have a Python language interface for programming convenience. However, many application programs for machines running the Windows operating system are written in C#, and transferring a trained model from Python to C# is usually awkward and often quite difficult. The ML.NET library is written in C#, so a developer can use the library directly in a C# application program.

In my article, I show how to use ML.NET to create a prediction model using logistic regression, arguably the simplest machine learning algorithm. Specifically, I show a synthetic-data example where the goal is to predict whether a patient will die or survive, based on their age, sex, and score on a kidney test.
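
The prediction mechanism itself is simple. Here is a rough Python sketch of what logistic regression computes; the article itself uses the ML.NET API in C#, and the weight and input values below are made-up numbers just to show the idea.

# logistic regression prediction, sketched in Python
# (hypothetical weights and inputs; the article uses ML.NET in C#)
import math

def predict(age, sex, kidney, weights, bias):
  # weighted sum of the inputs, then the logistic sigmoid function
  z = (weights[0] * age) + (weights[1] * sex) + (weights[2] * kidney) + bias
  return 1.0 / (1.0 + math.exp(-z))  # pseudo-probability of class 1

p = predict(age=0.48, sex=-1, kidney=0.56,
            weights=[-1.75, 0.32, -0.95], bias=0.30)
print("P(die) = %0.4f" % p)  # values over 0.5 predict class 1 ("die")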

The ML.NET library is still in preview mode and it has a lot of rough edges. As I point out in my article, ML.NET is based on a library named TLC which is used internally at Microsoft. And TLC is based on an earlier library named TMSN which was created in 2002 when the .NET Framework was new.

It will be interesting (well, to me anyway) to see what happens with the ML.NET library. Will it become popular and widely used? Or will it, like most code libraries, never really catch on and eventually fade away?



Kidney-shaped swimming pools in Palm Springs, California. People tend to either like Palm Springs or find the town too hot and boring. I enjoy visiting Palm Springs for several reasons including the wonderful mid-century modern architecture and the alien-like beauty of the desert.

Posted in Machine Learning

Police Officers Killed in the Line of Duty and Machine Learning

It’s not uncommon these days to read a sad news story where a police officer is killed in the line of duty. This topic has special meaning to me because a good friend of my father’s was a police officer in Kingman, Arizona when he was killed in a robbery attempt. I remember meeting him when I was young and him showing me his service revolver. Also, several men on my father’s side of the family served as police officers, and in fact, my father passed down to me a .38 Long Colt model 1892 service revolver from one of his uncles.

Anyway, to honor the policemen and policewomen who have been killed in the line of duty so far this year (January through September 2018), and to put human faces on cold statistics, I did an Internet search for officers killed in the line of duty. Here are the ones I was able to find information about; the information was surprisingly difficult to find, and there were several other murders of police officers for which I couldn't find solid details.

It’s my hope and dream that some day machine learning can be used to help prevent murders like the ones shown here.



Eric Joering and killer



Officer Amy Caprio and killer



Officer Glenn Doss and killer



Officer Paul Bauer and killer



Officer Justin Billa (with wife and child) and killer



Officer Steven Belanger and killer



Officer Mark Baserman and killer



Officer Chase Maddox and killer



Officer Christopher Morton and killer



Officer David Sherrard and killer



Officer Anthony Morelli and killer



Officer Heath Gumm and killer



Officer Patrick Rohrer and killer



Officer Michael Chesna and killer



Officer Joseph Gomm and killer



Officer Tyler Edenhofer (with his mother) and killer



Officer James White and killer



Officer Michael Michalski and killer



Officer Theresa King and killer



Officer Adam Edward Jobbers-Miller and killer



Officer Timothy Cole and killer



Officer Fadi Shukur and killer



Officer Armando Gallegos and killer



Officer Garrett Hull and killer



Officer Mark Stasyuk and killer



Officer Zach Moak and killer



Officer Kevin Connor and killer



Officer Sean Bolton and killer


Posted in Machine Learning, Miscellaneous

Installing TensorFlow and Keras on Windows without a Live Internet Connection

I’ve taught workshops on TensorFlow and Keras several times. I always like to start by going through the installation process because it’s not trivial and the process reveals several ideas that are very important.

I’ve got a workshop coming up with 100 people. I know from past experience that 100 people trying to download and then install TF/Keras at the same time over a wireless network connection just won’t work. The files are big and the installation process actively goes out to the Internet to grab dependencies.

So, I set out to learn how to install TF/Keras on Windows, without being connected to the Internet. How difficult could it be?

Approximately 12 hours later I knew exactly how difficult it could be. Very.

Briefly, I downloaded the Anaconda3 self-extracting executable. Then I discovered and downloaded 8 TF/Keras dependencies as WHL files. Then I did a preliminary installation to get access to three dependencies that didn’t have WHL files. All these files (and three pre-built directories) can be placed on a USB drive, and then it’s possible to install TF/Keras without an active Internet connection.


These are the files and directories needed to install TF/Keras on Windows without an active Internet connection

1. Double-click on the Anaconda3 5.2.0 self-extracting installer. This will install Python 3.6.5 and approximately 500 Python packages. It will also create a directory at

C:\Users\(user)\AppData\Local\Continuum\anaconda3\Lib\site-packages

2. Use “pip install” to install these 8 WHL packages that were downloaded earlier (the commands are sketched just after this list):

1. protobuf             3.6.1
2. grpcio               1.10.0
3. Markdown             3.0.1
4. absl_py              0.6.0
5. msgpack              0.5.6
6. astor                0.7.1
7. Keras_Applications   1.0.6
8. Keras_Preprocessing  1.0.5
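
The commands look something like the following, run from the directory that holds the downloaded files. The exact .whl filenames depend on which builds you downloaded, so treat these as placeholders:

pip install protobuf-3.6.1-cp36-cp36m-win_amd64.whl
pip install grpcio-1.10.0-cp36-cp36m-win_amd64.whl
(and so on, one "pip install" per remaining .whl file; the exact
 filenames will differ depending on the builds you downloaded)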

3. Copy these five pre-built package directories and one file into the site-packages directory:

gast
gast-0.2.0.dist-info

tensorboard
tensorboard-1.11.0.dist-info

termcolor-1.1.0.dist-info
termcolor.py

4. At last, install TensorFlow 1.11.0 using pip install.

5. TF now comes with a version of Keras but you can install Keras 2.2.4 using pip install if you wish.

6. Even after all this, there’ll be a couple of glitches. For me, I had to use pip install to upgrade my h5py package to version 2.8.0 to avoid an annoying warning message.



First part — installing WHL based packages (after installing Anaconda)



Last part of install (after copying pre-built directories for gast, tensorboard, termcolor)

I think the moral of the story is that working with machine learning libraries is still a relatively new activity. Compared to modern applications development using Java or C#, even simple things like ML library installation and configuration are quite primitive.

Posted in Keras, Machine Learning

NFL 2018 Week 9 Predictions – Zoltar Likes Vegas Favorites Panthers, Chiefs, Patriots

Zoltar is my NFL prediction computer program. It uses a deep neural network and reinforcement learning. Here are Zoltar’s predictions for week #9 of the 2018 NFL season:

Zoltar: fortyniners  by    4  dog =     raiders    Vegas: fortyniners  by  3.5
Zoltar:       bills  by    2  dog =       bears    Vegas:       bears  by    1
Zoltar:    panthers  by   10  dog =  buccaneers    Vegas:    panthers  by  6.5
Zoltar:      chiefs  by   12  dog =      browns    Vegas:      chiefs  by    8
Zoltar:    dolphins  by    6  dog =        jets    Vegas:    dolphins  by    3
Zoltar:     vikings  by    6  dog =       lions    Vegas:     vikings  by  5.5
Zoltar:    steelers  by    0  dog =      ravens    Vegas:      ravens  by    3
Zoltar:    redskins  by    4  dog =     falcons    Vegas:    redskins  by  1.5
Zoltar:      texans  by    0  dog =     broncos    Vegas:     broncos  by  2.5
Zoltar:    seahawks  by    2  dog =    chargers    Vegas:    seahawks  by  1.5
Zoltar:        rams  by    0  dog =      saints    Vegas:      saints  by    1
Zoltar:    patriots  by   10  dog =     packers    Vegas:    patriots  by  6.5
Zoltar:     cowboys  by    3  dog =      titans    Vegas:     cowboys  by  6.5

Zoltar theoretically suggests betting when the Vegas line is more than 3.0 points different from Zoltar’s prediction. For week #9 Zoltar has four hypothetical suggestions.

1. Zoltar likes the Vegas favorite Panthers against the Buccaneers. Zoltar thinks the Panthers are 10 points better than the Buccs but Vegas has the Panthers as favorites by only 6.5 points. Therefore, in betting terminology, Zoltar believes the Panthers will cover the spread.

2. Zoltar likes the Vegas favorite Chiefs against the Browns. Zoltar thinks the Chiefs are 12 points better than the Browns but Vegas has the Chiefs favored by only 8.0 points. I suspect that humans are over-reacting to the Browns firing their head coach yesterday, somehow thinking the Browns will play above their ability for their new coach. We'll see.

3. Zoltar likes the Vegas favorite Patriots against the Packers. Zoltar thinks the Patriots are 10 points better than the Packers, but Vegas has the Patriots favored by only 6.5 points.

4. Zoltar likes the Vegas underdog Titans against the Cowboys. Zoltar thinks the Cowboys are 3 points better than the Titans but Vegas thinks the Cowboys are 6.5 points better than the Titans. A bet on the Titans will pay off if the Titans win (by any score) or if the Cowboys win by less than 6.5 points (in other words, 6 points or less).

Theoretically, if you must bet $110 to win $100 (typical in Vegas) then you’ll make money if you predict at 53% accuracy or better. But realistically, you need to predict at 60% accuracy or better.
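
Here's a quick back-of-the-envelope check of that 53% figure (my own arithmetic, not part of Zoltar):

# break-even win rate when risking $110 to win $100:
# you need 100*p - 110*(1-p) >= 0, which gives p >= 110 / 210
p = 110.0 / (110.0 + 100.0)
print("%0.3f" % p)  # 0.524, so roughly 53 percent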

Just for fun, I track how well Zoltar does when trying to predict just which team will win a game (not by how many points). This isn't useful except for parlay betting.

Zoltar sometimes predicts a 0-point margin of victory. There are three such games in week #9: Steelers-Ravens, Texans-Broncos, Rams-Saints. For such games, in the first four weeks of the season Zoltar picks the home team to win; after week #4, Zoltar uses historical data for the current season (which usually, but not always, still results in a prediction that the home team will win).

==

Zoltar did fairly well in week #8. Against the Vegas point spread, Zoltar was 3-2. For the season so far, against the Vegas spread Zoltar is 26-15 which is about 63% accuracy.

Just predicting winners, Zoltar was a pretty good 11-3. Vegas also went 11-3 just predicting which team would win. For the season, just predicting which team will win, Zoltar is 86-33 (just above 72% accuracy) and Vegas is 84-33 (just below 72% accuracy).



My system is named after the Zoltar fortune teller machine (left). Some guy created an amazing motorized Zoltar machine he drives for parades (two center photos). Everyone loves Zoltar, even fake Zoltars (right).

Posted in Machine Learning, Zoltar

Recap of the 2018 Money 20/20 Conference

I gave a short talk titled “Understanding Deep Neural Networks for Finance” at the 2018 Money 20/20 Conference. The event ran October 21-24 and was in Las Vegas. I estimate the conference had about 12,000 attendees, speakers, staff, visitors, and exhibitors.

The conference covered all things related to technology and finance. I recognized many of the big-name finance companies that were represented at the event: Bank of America, Wells Fargo, J.P. Morgan, PayPal, and so on. The big technology companies were there too: Amazon, Google, IBM, and others. And there were many smaller companies (the event program listed over 500).


The conference Web site used some sort of clever software to convert photos of the speakers into cartoons. Sadly, the cartoon version of me (top row, right) looks better than the real me. Sigh. Click to enlarge to see me in cartoon magnificence.

The conference organizers said my talk had about 300 people attending. I was one of four speakers in a meta-session on AI and machine learning. I just talked about technology, specifically deep neural networks, LSTM networks, and deep RL systems. The other three speakers represented companies: Feedzai (innovative fraud detection), Kasisto (conversational bots for banking), and Upstart (credit analysis).


Room before and during my talk.

Attendees told me that Money 20/20 is the largest conference that targets finance technology. The Expo had hundreds of interesting booths and exhibits. To the best of my knowledge, this was the first year where there was a dedicated track for AI and machine learning (although AI and ML have been represented at the event for years now).

At most of the technology conferences I speak at, educational-style talks are the main activities, and expos are relatively minor in importance. But at Money 20/20, the Expo was the main activity, along with hundreds of deal-making micro-events everywhere. This makes sense: the field of finance is pretty cut-and-dried and so this conference primarily acts as a huge get-together where companies can meet with each other in a very efficient way.


The Expo had many booths related to machine learning in finance. And there were guys doing deal-making everywhere, including at unused Blackjack tables!

The biggest technical theme I saw at the conference was the attention paid to financial fraud. The numbers are truly scary, and detecting and defeating financial fraud is a huge challenge. One of my technical conclusions is that fraud probably has to be detected by looking at the complete chain of a financial transaction as a whole, rather than by looking at the individual parts of a transaction.

Another theme I noticed is that financial services are becoming increasingly commoditized, and all companies are actively looking for ways to differentiate themselves. Integrating machine learning and AI systems is an important way for companies to add value to their products and services and achieve that differentiation.

Money 20/20 — highly recommended for anyone who works in a financial environment. See https://www.money2020.com/

Posted in Conferences, Machine Learning

Gibbs Sampling Example Using a Discrete Distribution

Gibbs sampling is a complex topic because it involves about half a dozen ideas in probability, each of which is complex in its own right. It's not possible to completely understand Gibbs sampling from a single example; you need to look at several. Here's one that uses a discrete distribution instead of the usual example with a continuous (Gaussian) distribution.

Suppose you have a coin and a spinner, and that they’re connected in some mysterious way so that the result of one depends upon the previous result of the other.

Now suppose that, unknown to you, the true joint probability distribution is:

P(Heads and 0) = 0.10
P(Heads and 1) = 0.20
P(Heads and 2) = 0.10
P(Tails and 0) = 0.30
P(Tails and 1) = 0.10
P(Tails and 2) = 0.20

But suppose you somehow know the two conditional probability distributions:

P(0 | Heads) = 1/4
P(1 | Heads) = 1/2
P(2 | Heads) = 1/4

P(0 | Tails) = 1/2
P(1 | Tails) = 1/6
P(2 | Tails) = 1/3 

---------------------

P(Heads | 0) = 1/4
P(Tails | 0) = 3/4

P(Heads | 1) = 2/3
P(Tails | 1) = 1/3

P(Heads | 2) = 1/3
P(Tails | 2) = 2/3

Gibbs sampling is an algorithm that can be used to estimate the true joint probability distribution from the conditional distributions. The key part of the algorithm, in Python code, looks like:

for i in range(n):
  for j in range(thin):
    c = pick_coin(s)  # 0 or 1
    s = pick_spinner(c)  # 0, 1, 2
  # increment correct counter

The purpose of the “thin” parameter is to record only pairs of values that weren't generated close together, which reduces the correlation between successive samples. Function pick_coin() returns a random Heads (0) or Tails (1), depending on the value of the spinner. Function pick_spinner() returns a random 0 or 1 or 2, depending on the value of the coin.

My Gibbs demo iterates 1,000 times and gives a very close approximation to the true joint distribution.

If you’ve stumbled on this blog post while searching the Internet for an example of Gibbs sampling, I hope you pick up one more piece of the very complex puzzle.


# gibbs_discrete.py

import numpy as np
np.random.seed(0)

def pick_coin(spinner):
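  # sample a coin value (0 = Heads, 1 = Tails) from P(coin | spinner)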
  p = np.random.random()  # [0.0, 1.0)
  if spinner == 0:
    if p < 0.250000: return 0
    else: return 1
  elif spinner == 1:
    if p < 0.666667: return 0
    else: return 1
  elif spinner == 2:
    if p < 0.333333: return 0
    else: return 1

def pick_spinner(coin):
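  # sample a spinner value (0, 1, 2) from P(spinner | coin)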
  p = np.random.random()  # [0.0, 1.0)
  if coin == 0:
    if p < 0.250000: return 0
    elif p < 0.750000: return 1
    else: return 2
  elif coin == 1:
    if p < 0.500000: return 0
    elif p < 0.666667: return 1
    else: return 2  

def gibbs_sample(n=1000, thin=500):
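  # ctXY counts samples where coin == X and spinner == Y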
  ct00 = 0; ct01 = 0; ct02 = 0;
  ct10 = 0; ct11 = 0; ct12 = 0;  
  c=0; s=0
  for i in range(n):
    for j in range(thin):
      c = pick_coin(s)  # 0 or 1
      s = pick_spinner(c)  # 0, 1, 2
    # print(c, s)
    if   c == 0 and s == 0: ct00 += 1
    elif c == 0 and s == 1: ct01 += 1
    elif c == 0 and s == 2: ct02 += 1
    elif c == 1 and s == 0: ct10 += 1
    elif c == 1 and s == 1: ct11 += 1
    elif c == 1 and s == 2: ct12 += 1

  print("P(00) = %0.4f" % (ct00 / n))
  print("P(01) = %0.4f" % (ct01 / n))
  print("P(02) = %0.4f" % (ct02 / n))
  print("P(10) = %0.4f" % (ct10 / n))
  print("P(11) = %0.4f" % (ct11 / n))
  print("P(12) = %0.4f" % (ct12 / n))


print("\nBegin \n")
print("True joint distribution: \n")
print("P(00) = 0.10")
print("P(01) = 0.20")
print("P(02) = 0.10")
print("P(10) = 0.30")
print("P(11) = 0.10")
print("P(12) = 0.20")

print("\nEstimated using Gibbs sampling: \n") 
gibbs_sample(1000, 500)

print("\nDone \n")
Posted in Miscellaneous