Reading the MNIST Data Set with C#

Update: Thanks to Michal Wilczynski, who pointed out that the header integers in the MNIST files are stored in big-endian format, so reading them correctly on a little-endian machine requires reversing the bytes. See the bottom of this post.


The MNIST data set is a well-known collection of image data of handwritten digits (0-9) that is used to benchmark machine learning pattern recognition algorithms. The MNIST data is stored in four binary files, which can be awkward to deal with directly, so I decided to write a C# program to access the data.

Each digit image is a 28 x 28 grid of pixels, where each pixel value is a single byte between 0 (white) and 255 (completely black). Values in between are shades of gray.

There are 60,000 training digits and 10,000 test digits. Each of the two sets is stored in two binary files, one containing the pixel data and the other containing the corresponding label (0-9). The data files are available at http://yann.lecun.com/exdb/mnist/ in gzip form. I installed the free 7-Zip utility to unzip the files (I find WinZip increasingly annoying with their advertising).
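If you would rather not install a separate utility, the gzip files can also be unpacked from C# itself. The following is a minimal sketch, assuming the .NET GZipStream class from System.IO.Compression is available; the class name MnistUnzip and the file paths in the usage note are my own placeholders, not part of the original program.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class MnistUnzip
{
  // Decompresses a gzip file (gzPath) into a plain binary file (outPath).
  // MnistUnzip and Decompress are illustrative names only.
  public static void Decompress(string gzPath, string outPath)
  {
    using (FileStream input = File.OpenRead(gzPath))
    using (GZipStream gz =
      new GZipStream(input, CompressionMode.Decompress))
    using (FileStream output = File.Create(outPath))
    {
      gz.CopyTo(output); // streams the decompressed bytes to disk
    }
  }
}
```

A call might look like MnistUnzip.Decompress(@"C:\t10k-images.gz", @"C:\t10k-images.idx3-ubyte"), where the first path is a hypothetical location for the downloaded file.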

The screenshot below shows a snapshot of the program reading the 10,000 test images. The program simulates the image by using a blank space for white (0), a dot/period for gray (any value from 1 to 254), and an ‘O’ for black (255). The associated label is displayed below the image representation.

The program code is below. The main idea is to define a DigitImage class that has the pixels and the label of one digit. I open both files and read 28 x 28 bytes from the image file and one byte from the label file, and then combine them. Each file has some header ints (4 for the image data and 2 for the label data) that are read and discarded.
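Because the headers have a fixed size (four 4-byte ints in the image file, two in the label file) and every image is exactly 28 x 28 bytes, the position of any image or label can be computed directly. Here is a rough sketch of that arithmetic; the MnistLayout class and its method names are hypothetical helpers I made up for illustration, and the program in this post reads sequentially rather than seeking.

```csharp
using System;
using System.IO;

public static class MnistLayout
{
  public const int ImageHeaderBytes = 4 * 4; // four 32-bit header ints
  public const int LabelHeaderBytes = 2 * 4; // two 32-bit header ints
  public const int PixelsPerImage = 28 * 28; // one byte per pixel

  // Byte offset of image number 'index' (0-based) in the image file.
  public static long ImageOffset(int index)
  {
    return ImageHeaderBytes + (long)index * PixelsPerImage;
  }

  // Byte offset of label number 'index' (0-based) in the label file.
  public static long LabelOffset(int index)
  {
    return LabelHeaderBytes + (long)index;
  }

  // Seeks to one image and returns its 784 pixel bytes in row order.
  public static byte[] ReadImageAt(Stream images, int index)
  {
    images.Seek(ImageOffset(index), SeekOrigin.Begin);
    byte[] buffer = new byte[PixelsPerImage];
    int read = 0;
    while (read < buffer.Length) // Read may return fewer bytes than asked
    {
      int n = images.Read(buffer, read, buffer.Length - read);
      if (n == 0)
        throw new EndOfStreamException("Image file is too short.");
      read += n;
    }
    return buffer;
  }

  // Seeks to one label and returns it.
  public static byte ReadLabelAt(Stream labels, int index)
  {
    labels.Seek(LabelOffset(index), SeekOrigin.Begin);
    int b = labels.ReadByte();
    if (b < 0)
      throw new EndOfStreamException("Label file is too short.");
    return (byte)b;
  }
}
```

With this layout, pixel (i, j) of an image sits at position i * 28 + j in the returned array.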

[Screenshot: MNIST-Digit-Recognition]


using System;
using System.IO;

namespace ReadMNIST
{
  class Program
  {
    static void Main(string[] args)
    {
      try
      {
        Console.WriteLine("\nBegin\n");
        FileStream ifsLabels =
         new FileStream(@"C:\t10k-labels.idx1-ubyte",
         FileMode.Open); // test labels
        FileStream ifsImages =
         new FileStream(@"C:\t10k-images.idx3-ubyte",
         FileMode.Open); // test images

        BinaryReader brLabels =
         new BinaryReader(ifsLabels);
        BinaryReader brImages =
         new BinaryReader(ifsImages);
 
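        // The header ints are stored big-endian, but ReadInt32
        // assumes little-endian, so these values come back
        // byte-swapped. They are read only to skip the headers.
        int magic1 = brImages.ReadInt32(); // discard
        int numImages = brImages.ReadInt32(); // discard
        int numRows = brImages.ReadInt32(); // discard
        int numCols = brImages.ReadInt32(); // discard

        int magic2 = brLabels.ReadInt32(); // discard
        int numLabels = brLabels.ReadInt32(); // discard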

        byte[][] pixels = new byte[28][];
        for (int i = 0; i < pixels.Length; ++i)
          pixels[i] = new byte[28];

        // each test image
        for (int di = 0; di < 10000; ++di) 
        {
          for (int i = 0; i < 28; ++i)
          {
            for (int j = 0; j < 28; ++j)
            {
              byte b = brImages.ReadByte();
              pixels[i][j] = b;
            }
          }

          byte lbl = brLabels.ReadByte();

          DigitImage dImage =
            new DigitImage(pixels, lbl);
          Console.WriteLine(dImage.ToString());
          Console.ReadLine();
        } // each image

        // Closing a BinaryReader also closes its underlying stream.
        brImages.Close();
        brLabels.Close();

        Console.WriteLine("\nEnd\n");
        Console.ReadLine();
      }
      catch (Exception ex)
      {
        Console.WriteLine(ex.Message);
        Console.ReadLine();
      }
    } // Main
  } // Program

  public class DigitImage
  {
    public byte[][] pixels;
    public byte label;

    public DigitImage(byte[][] pixels,
      byte label)
    {
      this.pixels = new byte[28][];
      for (int i = 0; i < this.pixels.Length; ++i)
        this.pixels[i] = new byte[28];

      for (int i = 0; i < 28; ++i)
        for (int j = 0; j < 28; ++j)
          this.pixels[i][j] = pixels[i][j];

      this.label = label;
    }

    public override string ToString()
    {
      string s = "";
      for (int i = 0; i < 28; ++i)
      {
        for (int j = 0; j < 28; ++j)
        {
          if (this.pixels[i][j] == 0)
            s += " "; // white
          else if (this.pixels[i][j] == 255)
            s += "O"; // black
          else
            s += "."; // gray
        }
        s += "\n";
      }
      s += this.label.ToString();
      return s;
    } // ToString

  }
} // ns



Michal Wilczynski writes:

I also wanted to point out a little bug in the code that was driving me nuts today.

All of the Int32 values (i.e. the header/metadata of the dataset) parsed from the MNIST files are in big-endian format, i.e. you need to reverse their bytes if you’re on a little-endian processor (like most of us are nowadays, e.g. Intel processors).

A possible snippet for doing so, by Hans Passant (from https://stackoverflow.com/questions/20967088/what-did-i-do-wrong-with-binaryreader-in-c):

// Extension method: must be declared inside a static class.
public static int ReadBigInt32(this BinaryReader br)
{
  var bytes = br.ReadBytes(sizeof(Int32));
  if (BitConverter.IsLittleEndian)
    Array.Reverse(bytes);
  return BitConverter.ToInt32(bytes, 0);
}

This might help a lot of people who (like me) got stuck on such a tiny mistake (one that does not affect the rest of the dataset but could make people doubt their parsing methods).

Regards,
Michał
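Following up on this note, here is one way the headers could be read and sanity-checked instead of discarded. It is only a sketch: the MnistHeader class and its method names are hypothetical, and it assembles the bytes by hand rather than using the snippet above. The expected magic numbers (2051 for image files, 2049 for label files) are the values given in the file format description on the MNIST page.

```csharp
using System;
using System.IO;

public static class MnistHeader
{
  public const int ImageMagic = 2051; // 0x00000803
  public const int LabelMagic = 2049; // 0x00000801

  // Reads four bytes and combines them most-significant byte first,
  // so the result does not depend on the machine's endianness.
  public static int ReadBigEndianInt32(BinaryReader br)
  {
    byte[] bytes = br.ReadBytes(sizeof(int));
    if (bytes.Length != sizeof(int))
      throw new EndOfStreamException("Header is truncated.");
    return (bytes[0] << 24) | (bytes[1] << 16)
      | (bytes[2] << 8) | bytes[3];
  }

  // Returns { numImages, numRows, numCols } from an image file.
  public static int[] ReadImageHeader(BinaryReader br)
  {
    int magic = ReadBigEndianInt32(br);
    if (magic != ImageMagic)
      throw new InvalidDataException("Unexpected magic: " + magic);
    int numImages = ReadBigEndianInt32(br);
    int numRows = ReadBigEndianInt32(br);
    int numCols = ReadBigEndianInt32(br);
    return new int[] { numImages, numRows, numCols };
  }

  // Returns the number of labels from a label file.
  public static int ReadLabelHeader(BinaryReader br)
  {
    int magic = ReadBigEndianInt32(br);
    if (magic != LabelMagic)
      throw new InvalidDataException("Unexpected magic: " + magic);
    return ReadBigEndianInt32(br);
  }
}
```

Checking the magic number first gives an early, clear failure if the wrong file is opened or the file is still gzip-compressed, and the counts from the header could then replace hard-coded values such as 10000 and 28 in the loops.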
