Neural Networks using Python

I wrote an article titled “Use Python with your Neural Networks” in the November 2014 issue of Visual Studio Magazine. Although there are several Python implementations of neural networks available, sometimes writing your own code from scratch has advantages: you fully understand the code, you can customize it to meet your needs exactly, and you can keep the code simple (for example, by removing some error checks).
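To give the flavor of a from-scratch implementation, here is a minimal sketch (mine, not code from the article) of the feed-forward pass of a tiny neural network in Python; the network shape and weight values are made up for illustration:

```python
import math

def feed_forward(inputs, ih_weights, ho_weights):
    # hidden node values: weighted sum of inputs, then tanh activation
    hiddens = [math.tanh(sum(x * w for x, w in zip(inputs, wts)))
               for wts in ih_weights]
    # output node values: weighted sum of hidden values (no activation)
    return [sum(h * w for h, w in zip(hiddens, wts)) for wts in ho_weights]

# a tiny 2-3-1 network with made-up weights
ih = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # one row per hidden node
ho = [[0.7, 0.8, 0.9]]                      # one row per output node
print(feed_forward([1.0, 2.0], ih, ho))  # a single output value, about 1.806
```

A real implementation would add training (for example, back-propagation), but even this sketch shows how little scaffolding Python requires.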


Python is an interesting language. According to several sources, Python is currently one of the ten most common programming languages, and it’s making gains in education (thank goodness — in my opinion it’s well past time for Java to go away as a teaching language). If you work mostly with languages other than Python, a good side effect of exploring a neural network implemented in Python is that the process serves as a very nice overview of the language.

Posted in Machine Learning | 1 Comment

Recap of the 2014 DevIntersection Conference

I spoke at the 2014 DevIntersection conference, which ran from Sunday, November 9 through Thursday, November 13. The conference was at the MGM Grand hotel in Las Vegas. The event is for software developers who use the Microsoft technology stack, and for SQL Server people. My talk was “ML Studio: Microsoft’s New Machine Learning Tool for Developers”.


My talk had two main parts. In the first half, I explained exactly what neural networks are and why they’re one of the two or three primary machine learning tools. In the second half, I showed how to create a neural network using ML Studio, a drag-and-drop tool that doesn’t require coding.

I had a good time at DevIntersection and all of the attendees I talked to seemed to be having a good time too. I estimate there were about 1600 people there. In addition to the approximately 175 talks, there was an Expo where about 30 or 40 tech companies set up booths. Microsoft had three Code Heroes. Interesting. In the picture below, you should be able to figure out who the Code Heroes are and which person is me . . .


Every time I go to speak at a conference, I ask myself about the value. These tech conferences are pretty pricey, but in my opinion, they’re well worth the money. The benefit will vary from person to person. For most attendees, the main value is in practical techniques they learn. For me, the value of tech conferences comes mostly from getting re-energized and getting new ideas. I’m looking forward to next year’s event.

Posted in Machine Learning | Leave a comment

Consensus Classification using C#

I wrote an article titled “Consensus Classification using C#” in the November 2014 issue of Microsoft MSDN Magazine. Classification and its partner, prediction, are the most fundamental forms of machine learning.

A classification problem is one where you create a system that predicts a value belonging to one of two or more discrete classes. For example, you might want to predict whether a person is a political Democrat or a Republican, based on the person’s previous voting record.


There are many different machine learning classification techniques, including neural network classification, logistic regression classification, and decision tree classification. In the MSDN magazine article I present a classification technique that isn’t a standard one.

The idea of consensus classification is to take existing data, with known input and output values, and then instead of creating one very complex rule to determine the output (as in neural networks and logistic regression), generate many very short and simple rules. To predict the output for new input data, the system uses all the applicable simple rules and then the final prediction is the consensus.

For example, suppose there are 100 rules similar to, “if person voted yes on issue #3 and no on issue #12 then person is a Democrat.” Then, for some voting record, if 60 of the simple rules predict Republican, 30 of the rules predict Democrat, and 10 of the rules aren’t relevant, the final prediction is Republican because that’s the consensus (majority) opinion.
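A rough Python sketch of the consensus (majority-vote) idea; the rules and the voting record here are made up for illustration, not taken from the article:

```python
from collections import Counter

def consensus_predict(rules, votes):
    # apply every simple rule; a rule returns 'Democrat', 'Republican',
    # or None when it isn't relevant to this voting record
    predictions = [rule(votes) for rule in rules]
    tally = Counter(p for p in predictions if p is not None)
    return tally.most_common(1)[0][0]  # the consensus (majority) opinion

# three hypothetical short, simple rules over a yes/no voting record
rule1 = lambda v: 'Democrat' if v[3] == 'yes' and v[12] == 'no' else None
rule2 = lambda v: 'Republican' if v[0] == 'no' else None
rule3 = lambda v: 'Democrat' if v[5] == 'yes' else None

record = ['no'] + ['yes'] * 15  # a made-up 16-issue voting record
record[12] = 'no'
print(consensus_predict([rule1, rule2, rule3], record))  # Democrat
```

In a real system the many short rules would be generated automatically from the training data rather than written by hand.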

The consensus classification system I present is an example of what is sometimes called “ensemble learning”. I note in the article that there is no single best classification/prediction technique. Consensus classification has worked well for me in some rare situations where standard techniques just aren’t very effective.

Posted in Machine Learning | Leave a comment

Precision and Recall with Binary Classification

In machine learning, a binary classification problem is one where you are trying to predict something that can be one of two values. For example, suppose you are trying to predict if a baseball team will win (the “+” result) or lose (the “-” result). There are many ways to do binary classification. Probably the most basic technique is called logistic regression classification.


So you create some binary classifier and then use it to make predictions for historical data where you know the actual results. Suppose there are 100 games in your test set. There are four possible outcomes:

You predict team will win and they do win.
You predict team will win but they lose.
You predict team will lose and they do lose.
You predict team will lose but they win.

In general, the four outcomes are called True Positive (you predict + and are correct), False Positive (you predict + but are incorrect), True Negative (you predict – and are correct), and False Negative (you predict – but are incorrect).

Suppose that for the 100 games, your results are:

True Positive (TP) = 40 (correctly predicted a win)
False Positive (FP) = 20 (incorrectly predicted a win)
True Negative (TN) = 30 (correctly predicted a loss)
False Negative (FN) = 10 (incorrectly predicted a loss)

If you put the data above into a 2×2 table, it’s called a “confusion matrix”.

The most fundamental way to evaluate your binary classification model is to compute your accuracy. Here you were correct a total of 40 + 30 = 70 times out of 100 so the model’s accuracy is 0.70. Pretty good.

A fancier way to evaluate the model is to compute “precision” and “recall”. Precision and recall are defined as:

Precision = TP / (TP+FP) = 40 / (40+20) = 40/60 = 0.67

Recall = TP / (TP+FN) = 40 / (40+10) = 40/50 = 0.80
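Using the counts above, the arithmetic can be verified with a few lines of Python:

```python
# confusion matrix counts from the example above
tp, fp, tn, fn = 40, 20, 30, 10

accuracy = (tp + tn) / (tp + fp + tn + fn)   # 70/100 = 0.70
precision = tp / (tp + fp)                   # 40/60, about 0.67
recall = tp / (tp + fn)                      # 40/50 = 0.80

print(accuracy, round(precision, 2), round(recall, 2))
```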

Precision and recall are both similar to accuracy, but both are difficult to understand conceptually. Precision is sort of like accuracy, but it looks only at the data you predicted positive (in this example, only the games where you predicted a win). Recall is also sort of like accuracy, but it looks only at the data that is actually positive (in this example, the games the team actually won).

I go crazy trying to understand the deep meaning of precision and recall, and much prefer to just think of them as two numbers that measure the quality of a binary classification model.

Now any ML binary classifier has one or more parameters that you can adjust, which will create a different resulting model. In the case of a logistic regression classifier, you can adjust something called the threshold, which is an internal number between 0 and 1 that determines whether a prediction is positive or not. As you increase the threshold value above 0.5, it becomes more difficult for a data item to be classified as positive.

So in the example of predicting whether the baseball team will win (so you can bet on them), if you use a high threshold, like 0.75, then you won’t get as many “win” predictions as you would with a lower threshold value, but with the higher threshold you’ll be more likely to win your bet when the classifier predicts a win. In other words, there’s a tradeoff between getting lots of betting opportunities with a moderate probability of winning, and getting fewer betting opportunities but with a higher probability of winning.

If you change a binary classifier parameter (the threshold, for a logistic regression classifier), the precision and recall will change too. But as the precision increases (your chance of winning your bet), the recall (your number of betting opportunities) typically decreases, and vice versa.

For logistic regression classification, every value of the threshold will give you a precision value and a recall value. If you graph these points (with precision on the y-axis and recall on the x-axis), you get a precision-recall curve (or equivalently, a precision-recall graph). It would look something like the graph at the top of this post.

Each point on the precision-recall curve corresponds to a value of the threshold of the model. Unfortunately, precision-recall graphs usually don’t label each point with the corresponding value of the model parameter, even though they should.
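As a rough illustration (the predicted probabilities and outcomes below are made up, and this is not code from the post), the points on a precision-recall curve can be generated by sweeping the threshold:

```python
def pr_point(probs, actuals, threshold):
    # classify an item as positive when its predicted probability
    # exceeds the threshold, then compute precision and recall
    preds = [p > threshold for p in probs]
    tp = sum(1 for pr, a in zip(preds, actuals) if pr and a)
    fp = sum(1 for pr, a in zip(preds, actuals) if pr and not a)
    fn = sum(1 for pr, a in zip(preds, actuals) if not pr and a)
    precision = tp / (tp + fp) if tp + fp > 0 else 1.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return precision, recall

# made-up predicted win probabilities and actual outcomes
probs   = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.35, 0.2]
actuals = [True, True, False, True, False, True, False, False]

for t in (0.3, 0.5, 0.75):
    p, r = pr_point(probs, actuals, t)
    print(t, round(p, 2), round(r, 2))
```

Notice that as the threshold rises, precision goes up while recall goes down, which is the tradeoff described above.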

Every problem will have different priorities, and you have to adjust the threshold (or whatever parameters your binary classifier uses) to get higher precision or higher recall, at the expense of the other.

(Note: Thanks to Richard Hughes who pointed out a math error in an earlier blog post on this topic.)

Posted in Machine Learning

Classes in PowerShell v5 Scripts

PowerShell is a Microsoft scripting language used primarily by IT guys. I recently downloaded the pre-release version of PowerShell v5. It’s part of the Windows Management Framework v5. As a developer, the one new feature in v5 I was most interested in is the ability to create a class (data and functions) in a PowerShell script.

Update note: I was using the September 2014 Preview and discovered that there’s still quite a bit of work to be done on PowerShell classes. In particular, class methods cannot return arrays yet. The PowerShell team says that feature should be added in the November 2014 release.


Here’s the code for a quick demo:

write-host "`nBegin demo of PowerShell v5 classes"

class Person {
  [string]$lastName
  [string]$firstName

  [void] SetLast([string]$ln) {
    $this.lastName = $ln
  }

  [string] ToString() {
    return $this.firstName + " " + $this.lastName
  }
}

write-host "`nCreating Person object Thomas Smith"
[Person]$p = [Person]::new()
$p.lastName = "Smith"
$p.firstName = "Thomas"

write-host "`nLast name is: "
write-host $p.lastName

write-host "`nChanging last name to 'Law'"
$p.SetLast("Law")

write-host "`nPerson full name is: "
[string]$fullName = $p.ToString()
write-host $fullName

write-host "`nEnd demo"

The demo script defines a class named Person. The class has two data members, lastName and firstName. An instance of the class is created with:

[Person]$p = [Person]::new()

Class method SetLast can give a value to the last name like so:

$p.SetLast("Law")

Because class data is fully accessible, giving or changing the last name could also have been done directly by:

$p.lastName = "Law"

Anyway, the ability to use program-defined classes in a PowerShell v5 script is a very nice addition to the language. I intend to see if I can port some of my C# machine learning code to PowerShell.

Posted in Machine Learning

Machine Learning using C# Succinctly

A book I wrote has just been published — “Machine Learning using C# Succinctly”. The publisher is Syncfusion Press. They’re an interesting company because they publish technical books and make them freely available if you register your e-mail address. This allows the company to send you e-mail messages, but they’re pretty reasonable about it (I investigated before I committed to writing the book).


In the book I explain in detail how to code basic machine learning systems using the C# programming language. The table of contents is:

Chapter 1 – K-Means Clustering
Chapter 2 – Categorical Data Clustering
Chapter 3 – Logistic Regression Classification
Chapter 4 – Naive Bayes Classification
Chapter 5 – Neural Network Classification

These five topics are more-or-less the “Hello World” (fundamental) techniques in machine learning. By the way, there’s no consensus on exactly what machine learning means. In my mind, machine learning is any system that uses data to make predictions.

If you’re a developer who uses the Microsoft technologies stack, you might want to take a look at the book. As I mentioned, it’s available in PDF form as a free download. Hard to beat that price.

Posted in Machine Learning

A PGM Image Viewer using C#

I’ve been playing with image recognition lately, an area I am not very familiar with. Image recognition led me to Gaussian kernels and image distortion. And these topics led me to image viewing. For fun I wrote an image viewer program (see below) that displays PGM images.


The PGM format is one of the simplest image formats and is one I’d never heard of until doing my image research. PGM stands for Portable Gray Map. There are actually several variations of PGM files but the most common form is a binary file of pixel values, with some header lines.

For example, the screenshot above is my viewer program displaying file coins.pgm which contains:

P5
# coins.pgm
300 246
255
49 50 48 . . . (a total of 300*246 values)

The P5 is called a “magic number” (even though it’s a string) that identifies the file type. Lines that start with ‘#’ are comments. The 300 and 246 are the width and height, in pixels. The 255 is the maximum pixel value in the file (0 is black, 255 is white, values in between are shades of gray). The values 49, 50, and so on are the pixel values, from left to right, top to bottom. The pixel values are stored in binary so if you opened file coins.pgm with notepad or Word, you would see strange characters.

The key calling code is:

string file = textBox1.Text;
PgmImage pgmImage = LoadImage(file);
int magnify = int.Parse(textBox2.Text.Trim());
Bitmap bitMap = MakeBitmap(pgmImage, magnify);
pictureBox1.Image = bitMap;

I use a program-defined PgmImage object:

public class PgmImage
{
  public int width;
  public int height;
  public int maxVal;
  public byte[][] pixels;

  public PgmImage(int width, int height, int maxVal,
    byte[][] pixels)
  {
    this.width = width;
    this.height = height;
    this.maxVal = maxVal;
    this.pixels = pixels;
  }
}

Method LoadImage reads the target PGM file using the .NET BinaryReader class:

public PgmImage LoadImage(string file)
{
  FileStream ifs = new FileStream(file, FileMode.Open);
  BinaryReader br = new BinaryReader(ifs);

  string magic = NextNonCommentLine(br);
  if (magic != "P5")
    throw new Exception("Unknown magic number: " + magic);
  listBox1.Items.Add("magic number = " + magic);

  string widthHeight = NextNonCommentLine(br);
  string[] tokens = widthHeight.Split(' ');
  int width = int.Parse(tokens[0]);
  int height = int.Parse(tokens[1]);
  listBox1.Items.Add("width height = " + width + " " + height);

  string sMaxVal = NextNonCommentLine(br);
  int maxVal = int.Parse(sMaxVal);
  listBox1.Items.Add("maxVal = " + maxVal);

  // read width * height pixel values . . .
  byte[][] pixels = new byte[height][];
  for (int i = 0; i < height; ++i)
    pixels[i] = new byte[width];

  for (int i = 0; i < height; ++i)
    for (int j = 0; j < width; ++j)
      pixels[i][j] = br.ReadByte();

  br.Close(); ifs.Close();

  PgmImage result = new PgmImage(width, height, maxVal, pixels);
  listBox1.Items.Add("image loaded");
  return result;
}

Reading the header lines is done by a helper NextNonCommentLine and its helper NextAnyLine:

static string NextAnyLine(BinaryReader br)
{
  string s = "";
  byte b = 0; // dummy value to enter the loop
  while (b != 10) // 10 is the newline byte
  {
    b = br.ReadByte();
    char c = (char)b;
    s += c;
  }
  return s.Trim();
}

static string NextNonCommentLine(BinaryReader br)
{
  string s = NextAnyLine(br);
  while (s.StartsWith("#") || s == "")
    s = NextAnyLine(br);
  return s;
}

Once the PGM image is loaded into the program-defined PgmImage object, a .NET Bitmap object (which is essentially an image) is created. The code is short but not obvious:

static Bitmap MakeBitmap(PgmImage pgmImage, int mag)
{
  int width = pgmImage.width * mag;
  int height = pgmImage.height * mag;
  Bitmap result = new Bitmap(width, height);
  Graphics gr = Graphics.FromImage(result);
  for (int i = 0; i < pgmImage.height; ++i) {
    for (int j = 0; j < pgmImage.width; ++j) {
      int pixelColor = pgmImage.pixels[i][j];
      Color c = Color.FromArgb(pixelColor, pixelColor, pixelColor);
      SolidBrush sb = new SolidBrush(c);
      gr.FillRectangle(sb, j * mag, i * mag, mag, mag);
    }
  }
  return result;
}

Once the Bitmap object is created, assigning the object to the Image property of a .NET PictureBox control automatically displays the image.

Posted in Machine Learning