Machine Learning using C# Succinctly

A book I wrote, “Machine Learning using C# Succinctly”, has just been published by Syncfusion Press. They’re an interesting company because they publish technical books and make them freely available if you register your e-mail address. This allows the company to send you e-mail messages, but they’re pretty reasonable about it (I investigated before I committed to writing the book).


In the book I explain in detail how to code basic machine learning systems using the C# programming language. The table of contents is:

Chapter 1 – K-Means Clustering
Chapter 2 – Categorical Data Clustering
Chapter 3 – Logistic Regression Classification
Chapter 4 – Naive Bayes Classification
Chapter 5 – Neural Network Classification

These five topics are more-or-less the “Hello World” (fundamental) techniques of machine learning. By the way, there’s no consensus on exactly what machine learning means. In my mind, machine learning is any system that uses data to make predictions.

If you’re a developer who uses the Microsoft technologies stack, you might want to take a look at the book. As I mentioned, it’s available in PDF form as a free download. Hard to beat that price.

Posted in Machine Learning | Leave a comment

A PGM Image Viewer using C#

I’ve been playing with image recognition lately, an area I am not very familiar with. Image recognition led me to Gaussian kernels and image distortion. And these topics led me to image viewing. For fun I wrote an image viewer program (see below) that displays PGM images.


The PGM format is one of the simplest image formats and is one I’d never heard of until doing my image research. PGM stands for Portable Gray Map. There are actually several variations of the PGM format, but the most common is a binary file of pixel values preceded by a few text header lines.

For example, the screenshot above is my viewer program displaying file coins.pgm which contains:

P5
# coins.pgm
300 246
255
49 50 48 . . . (a total of 300*246 values)

The P5 is called a “magic number” (even though it’s a string) that identifies the file type. Lines that start with ‘#’ are comments. The 300 and 246 are the width and height, in pixels. The 255 is the maximum pixel value in the file (0 is black, 255 is white, values in between are shades of gray). The values 49, 50, and so on are the pixel values, from left to right, top to bottom. The pixel values are stored in binary, so if you opened file coins.pgm with Notepad or Word, you would see strange characters.

The key calling code is:

string file = textBox1.Text;
PgmImage pgmImage = LoadImage(file);
int magnify = int.Parse(textBox2.Text.Trim());
Bitmap bitMap = MakeBitmap(pgmImage, magnify);
pictureBox1.Image = bitMap;

I use a program-defined PgmImage object:

public class PgmImage
{
  public int width;
  public int height;
  public int maxVal;
  public byte[][] pixels; // pixels[row][col]

  public PgmImage(int width, int height, int maxVal,
    byte[][] pixels)
  {
    this.width = width;
    this.height = height;
    this.maxVal = maxVal;
    this.pixels = pixels;
  }
}
Method LoadImage reads the target PGM file using the .NET BinaryReader class:

public PgmImage LoadImage(string file)
{
  FileStream ifs = new FileStream(file, FileMode.Open);
  BinaryReader br = new BinaryReader(ifs);

  string magic = NextNonCommentLine(br);
  if (magic != "P5")
    throw new Exception("Unknown magic number: " + magic);
  listBox1.Items.Add("magic number = " + magic);

  string widthHeight = NextNonCommentLine(br);
  string[] tokens = widthHeight.Split(' ');
  int width = int.Parse(tokens[0]);
  int height = int.Parse(tokens[1]);
  listBox1.Items.Add("width height = " + width + " " + height);

  string sMaxVal = NextNonCommentLine(br);
  int maxVal = int.Parse(sMaxVal);
  listBox1.Items.Add("maxVal = " + maxVal);

  // read width * height pixel values . . .
  byte[][] pixels = new byte[height][];
  for (int i = 0; i < height; ++i)
    pixels[i] = new byte[width];

  for (int i = 0; i < height; ++i)
    for (int j = 0; j < width; ++j)
      pixels[i][j] = br.ReadByte();

  br.Close(); ifs.Close();

  PgmImage result = new PgmImage(width, height, maxVal, pixels);
  listBox1.Items.Add("image loaded");
  return result;
}

Reading the header lines is done by a helper NextNonCommentLine and its helper NextAnyLine:

static string NextAnyLine(BinaryReader br)
{
  string s = "";
  byte b = 0; // dummy value so the loop starts
  while (b != 10) // stop at newline
  {
    b = br.ReadByte();
    char c = (char)b;
    s += c;
  }
  return s.Trim(); // Trim removes any trailing carriage return
}

static string NextNonCommentLine(BinaryReader br)
{
  string s = NextAnyLine(br);
  while (s.StartsWith("#") || s == "")
    s = NextAnyLine(br);
  return s;
}

Once the PGM image is loaded into the program-defined PgmImage object, a .NET Bitmap object (which is essentially an image) is created. The code is short but not obvious:

static Bitmap MakeBitmap(PgmImage pgmImage, int mag)
{
  int width = pgmImage.width * mag;
  int height = pgmImage.height * mag;
  Bitmap result = new Bitmap(width, height);
  Graphics gr = Graphics.FromImage(result);
  for (int i = 0; i < pgmImage.height; ++i) {
    for (int j = 0; j < pgmImage.width; ++j) {
      int pixelColor = pgmImage.pixels[i][j];
      Color c = Color.FromArgb(pixelColor, pixelColor, pixelColor);
      SolidBrush sb = new SolidBrush(c);
      gr.FillRectangle(sb, j * mag, i * mag, mag, mag);
      sb.Dispose();
    }
  }
  return result;
}

Once the Bitmap object is created, assigning the object to the Image property of a .NET PictureBox control automatically displays the image.

Posted in Machine Learning | Leave a comment

Neural Network Training using Simplex Optimization

I wrote an article titled “Neural Network Training using Simplex Optimization” in the October 2014 issue of Visual Studio Magazine. A neural network is like a complicated math equation that has variables and coefficients. Training a neural network is the process of finding the values for the coefficients (which are called weights and biases).


To find good values for the weights and biases, you use training data that has known input and output values. You want to minimize the error between computed outputs and the actual outputs. This is called a numerical optimization problem.
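To make the error idea concrete, here is a minimal sketch of one common error term, mean squared error. This helper is my illustration, not code from the article:

```csharp
// Illustration only (not the article's code): the mean squared
// error between a network's computed outputs and the known target
// outputs. Training searches for weights and biases that make
// this value as small as possible.
static double MeanSquaredError(double[][] computed, double[][] targets)
{
  double sum = 0.0;
  int count = 0;
  for (int i = 0; i < computed.Length; ++i)  // each training item
  {
    for (int j = 0; j < computed[i].Length; ++j)  // each output node
    {
      double diff = computed[i][j] - targets[i][j];
      sum += diff * diff;
      ++count;
    }
  }
  return sum / count;
}
```

For example, a single training item with computed output 0.5 and target output 1.0 gives an error of 0.25.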

There are about a dozen or so common numerical optimization techniques that can be used to train a neural network. By far the most common technique is called back-propagation. Another technique that is becoming increasingly popular is called particle swarm optimization.

One of the oldest numerical optimization techniques is called simplex optimization. A simplex is a triangle. Simplex optimization uses three candidate solutions. There are many variations of simplex optimization. The most common is called the Nelder-Mead algorithm. My article uses a simpler version of simplex optimization that doesn’t have a particular name.

Simplex optimization is also known as amoeba method optimization, not because it mimics the behavior of an amoeba, but because if you graph the behavior of the algorithm, which is based on geometry, the triangle looks like it is oozing across the screen, vaguely resembling an amoeba.
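To make the geometry concrete, here is a bare-bones sketch of the idea (my illustration, simpler even than the article's version, and much simpler than Nelder-Mead): keep three candidate solutions, and repeatedly replace the worst one either by reflecting it through the centroid of the other two or, if the reflection is no better, by contracting it toward that centroid. The toy function F is an assumption for the demo:

```csharp
// Toy function to minimize (assumed for this demo): f(x, y) = x^2 + y^2.
static double F(double[] p) { return p[0] * p[0] + p[1] * p[1]; }

// Bare-bones simplex minimization with three candidate solutions.
static double[] SimplexMinimize(double[][] pts, int maxIter)
{
  for (int iter = 0; iter < maxIter; ++iter)
  {
    Array.Sort(pts, (a, b) => F(a).CompareTo(F(b))); // best candidate first
    double[] worst = pts[2];
    double[] cen = { (pts[0][0] + pts[1][0]) / 2.0,
                     (pts[0][1] + pts[1][1]) / 2.0 }; // centroid of best two
    double[] refl = { 2.0 * cen[0] - worst[0],
                      2.0 * cen[1] - worst[1] }; // reflect worst through centroid
    if (F(refl) < F(worst))
      pts[2] = refl; // reflected point is better; accept it
    else
      pts[2] = new double[] { (worst[0] + cen[0]) / 2.0,
                              (worst[1] + cen[1]) / 2.0 }; // contract toward centroid
  }
  Array.Sort(pts, (a, b) => F(a).CompareTo(F(b)));
  return pts[0]; // best candidate found
}
```

Starting from, say, (3,1), (2,4), (5,5), the triangle oozes toward (0,0), the minimum. For neural network training, each candidate solution would be a full set of weights and biases, and F would be the training error.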

Posted in Machine Learning | Leave a comment

The Logit Log-Odds Function in Machine Learning

This week I was working with the logit function, also known as the log-odds function. There are plenty of deep math explanations of the logit function, but I think most descriptions miss the main point.

The probability of an event, p, is a number between 0 and 1 that is a measure of how likely the event is. The bottom line is that a logit function result is (almost) a number between -4 and +4 that is a measure of how likely an event is. I say “almost”, because in theory a logit result can be from -infinity to +infinity, but in most situations the result is between about -4 and +4, and in the majority of those situations the result is between -2 and +2.

In other words, probability and logit values describe how likely an event is.

The definition of the logit function, where log is the natural logarithm, is

logit(p) = log(p / (1-p))

Notice that the only real information in the logit function is a probability, so logit cannot supply more information than probability does. The p / (1-p) term is the odds of an event. For example, if the probability of some event is 0.75, then the odds of the event are 0.75 / (1 - 0.75) = 3 / 1, or “three to one odds”. So logit is just the log of a probability expressed as odds, hence the name log-odds, which was shortened to “logit”.
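A quick coded version of the definition (in C#, Math.Log is the natural log):

```csharp
// logit(p) = log of the odds p / (1-p)
static double Logit(double p)
{
  return Math.Log(p / (1.0 - p)); // Math.Log is the natural log
}

// Logit(0.75) is Math.Log(3.0), about 1.10 ("three to one odds")
// Logit(0.50) is 0.0 (even odds)
// Logit(0.98) is about 3.89 -- even an extreme probability gives
// a logit value of only around 4
```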

Here’s what the logit function looks like (the tails go off to infinity):


So, why use the logit function at all? There are two reasons why the logit function might be used. First, because a logit value that is negative is less than 50% likely, and a logit value that is positive is more than 50% likely, logit values are easy to interpret by eye for some problems. The second reason is that, because of properties of the math log function, two logit values can sometimes be easier to compare than the two associated probabilities. I don’t really buy either reason to be honest — I prefer to use probabilities.

Final notes: the logit function is the mathematical inverse of the logistic sigmoid function:

logistic(z) = 1.0 / (1.0 + e^-z)

The logistic sigmoid function has many uses in machine learning. And, the logistic sigmoid function is closely related to tanh, the hyperbolic tangent function, another common ML function, especially with neural networks. The relationship between logistic and tanh is:

tanh(z) = 2 * logistic(2z) - 1

logistic(z) = (tanh(z/2) + 1) / 2
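These identities are easy to verify numerically. A quick check (my sketch, using the .NET Math class):

```csharp
// The logistic sigmoid function.
static double Logistic(double z)
{
  return 1.0 / (1.0 + Math.Exp(-z));
}

static void CheckIdentities()
{
  double z = 0.8; // any value works
  // tanh(z) = 2 * logistic(2z) - 1
  Console.WriteLine(Math.Tanh(z) + " " + (2.0 * Logistic(2.0 * z) - 1.0));
  // logistic(z) = (tanh(z/2) + 1) / 2
  Console.WriteLine(Logistic(z) + " " + ((Math.Tanh(z / 2.0) + 1.0) / 2.0));
  // each pair prints the same value
}
```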

In short, the logit, logistic sigmoid, and tanh functions are all related to each other and are conceptually based on probability.

Posted in Machine Learning

Probit Classification using C#

In machine learning, a classification problem is one where you want to predict something that takes on a class value (such as “died” or “survived”) as opposed to a strictly numeric value (such as blood pressure). The variables used to make the prediction are called the features, or the independent variables. For example, to predict whether a hospital patient will die or survive, you might use the features age, sex, and kidney-test score.


There are several ML classification techniques, for example, logistic regression classification, neural network classification, decision tree classification, and naive Bayes classification. Different classification techniques tend to be suited to different types of problems.

I wrote an article titled “Probit Classification using C#” in the October 2015 issue of MSDN Magazine. Probit classification is very similar to logistic regression classification. Probit stands for “probability unit” because the result of probit classification is a number between 0 and 1 that can be interpreted as a probability.

Probit classification isn’t used as often as other classification techniques, except by analysts who work in finance and economics. I believe this is mostly for historical reasons. Probit classification tends to give results that are pretty much the same as logistic regression classification.
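The difference from logistic regression is only the squashing function: logistic regression computes p = 1 / (1 + e^-z), while probit computes p = Phi(z), the standard normal cumulative density. The .NET Math class has no built-in Phi, so an implementation typically uses a polynomial approximation. Here is one sketch based on the well-known Abramowitz and Stegun approximation to erf (my illustration, not necessarily the article's exact code):

```csharp
// Standard normal cumulative density Phi(z), via the classic
// Abramowitz & Stegun polynomial approximation to erf
// (maximum error roughly 1.5e-7).
static double CumDensity(double z)
{
  double x = Math.Abs(z) / Math.Sqrt(2.0);
  double t = 1.0 / (1.0 + 0.3275911 * x);
  double poly = ((((1.061405429 * t - 1.453152027) * t
    + 1.421413741) * t - 0.284496736) * t + 0.254829592) * t;
  double erf = 1.0 - poly * Math.Exp(-x * x);
  double p = 0.5 * (1.0 + erf);
  return (z >= 0.0) ? p : 1.0 - p; // symmetry for negative z
}

// CumDensity(0.0) is 0.5; CumDensity(1.96) is about 0.975
```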

Posted in Machine Learning

A Recap of Science Fiction Movies of 2013

The year 2013 is now long past, so I figured I’d review the science fiction films released that year. Here are 10 significant (meaning only that I saw them) sci-fi movies from 2013, with my ratings, from best to worst. I didn’t include superhero movies like Iron Man 3, Man of Steel, Thor: The Dark World, and The Wolverine because they belong in a separate category in my mind.

1. Gravity – Grade = A. Sandra Bullock and George Clooney adrift in space. This movie had me on the edge of my seat from the first scene to the last. No real plot to speak of, and not much character development, so I can understand why a lot of people don’t like this movie so much. But I loved it.


2. Oblivion – Grade = A-. Tom Cruise as one of the few humans on Earth after a war with aliens. I rarely have high expectations for a Tom Cruise vehicle, but I was very pleasantly surprised. Very clever plot. At the end, I found myself saying, in a good way, “Why didn’t I see that coming!?”


3. The Hunger Games: Catching Fire – Grade = B. Jennifer Lawrence competes against previous champions in a fight to the death. I didn’t like the first Hunger Games (2012) movie at all. I was forced to see this film and it was another pleasant surprise. This second Hunger Games was a lot less trite and clichéd than the first.


4. Star Trek Into Darkness – Grade = B. Kirk versus Khan. Again. A third 2013 film that exceeded my expectations. I liked the first Chris Pine-as-Kirk Star Trek (2009) a lot, but sequels can be iffy propositions, so I didn’t know what to expect here. I prefer this sequel to the 2009 film. Good action combined with an intelligent plot.


5. Ender’s Game – Grade = B-. Space cadet Ender Wiggin destroys an alien species then saves their last egg. Many of my friends are huge fans of the book and so were rather disappointed with this somewhat lackluster film. Not a bad film, just not a really good film.


6. Pacific Rim – Grade = C. Giant robots are created to battle giant alien monsters emerging from some other dimension or something on the ocean floor. I liked this much better than I thought I would. A good friend of mine, Ken L., is often a perfect negative indicator for me. Usually, movies that he likes a lot, like Sucker Punch (2011), are not my favorites, and movies I like typically leave him unimpressed. So, when Ken raved about Pacific Rim, I was wary. But it was much better than I thought it’d be.


7. Elysium – Grade = C. Matt Damon champions the downtrodden on Earth against the elite citizens in space. I like most Matt Damon films. And the story sounded intriguing. But the movie just didn’t do much for me. The entire story seemed a bit illogical and too far-fetched to me, which I know is weird when I can suspend disbelief for other films.


8. Europa Report – Grade = C. Found video footage reveals how a mission to Jupiter’s moon Europa went wrong. Not a bad low-budget little movie. Seemed fairly realistic to me, but the story dragged a bit.


9. Riddick – Grade = C. Arg! Vin Diesel doing a lot of staring and a lot of fighting. I really like The Chronicles of Riddick (2004) and had high hopes. But, alas, this part III just didn’t come together for me. Just a little too slow, a little too lame. One of those films where the whole is less than the sum of its parts.


10. After Earth – Grade = F. An incredibly annoying Jaden Smith and a moderately annoying Will Smith roaming randomly in a thoroughly annoying movie. Bad movie. Very bad movie. It’s hard for a movie with this much action to be both boring and nonsensical, but this film managed it. Epic bad.


Posted in Top Ten | 1 Comment

Creating Neural Networks using Azure Machine Learning Studio

Several weeks ago, Microsoft released a new tool and system to create machine learning models. I wrote an article titled “Creating Neural Networks using Azure Machine Learning Studio” in the September 2014 issue of Visual Studio Magazine.


The system is cloud-based (on Microsoft Azure). The back end, where computations are performed, is called Microsoft Azure Machine Learning (sometimes abbreviated MAML). The front-end UI part of the system is a Web application called Machine Learning Studio (ML Studio).

In the article, I describe, step by step, how to create a neural network model that predicts the species (either “setosa”, “versicolor”, or “virginica”) of an iris flower, based on four numeric features: sepal length, sepal width, petal length, and petal width. A sepal is a green leaf-like structure.

ML Studio is an almost completely drag-and-drop system. You drag modules that represent either data or actions on data (functions or methods, to a programmer) onto a design surface and then connect the modules.

(Figure 1: the iris experiment in ML Studio)

The graphical approach is much, much faster than creating a prediction model using code. On the downside, the SDK for the system has not yet been released so you can’t write custom modules, meaning you can only do whatever the built-in modules can do. An analogy is Lego. With a lot of Lego modules you can build a lot of cool things. But if you had some machine to design and create custom Lego pieces (like an SDK), you could build anything.

It will be interesting to see if Azure ML gains traction among developers, business analysts, and data scientists. I think Azure ML is very cool, but the technology landscape is littered with the carcasses of great technologies that never caught on because of bad marketing or bad timing or just bad luck.

Posted in Machine Learning