Graphing Ackley’s Function using SciLab

Ackley’s function is a standard benchmark mathematical function used to test numerical optimization algorithms. The function is defined in two dimensions as:

f(x,y) = -20 * exp(-0.2 * sqrt(0.5 * (x^2 + y^2)))
         - exp(0.5 * (cos(2*pi*x) + cos(2*pi*y)))
         + 20 + e

The function has a minimum value of 0.0 located at x = 0, y = 0. I wanted to graph the function so I used SciLab, a free program that is similar to the very expensive MatLab program.
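As a quick sanity check of that minimum (this C# snippet is mine; it is not part of the SciLab session), the value at the origin works out to exactly 0: the -20*exp(0) term cancels the +20, and the -exp(1) term cancels the +e.

using System;
class AckleyCheck
{
  // Ackley's function in two dimensions
  static double Ackley(double x, double y)
  {
    return -20.0 * Math.Exp(-0.2 * Math.Sqrt(0.5 * (x * x + y * y)))
      - Math.Exp(0.5 * (Math.Cos(2.0 * Math.PI * x) + Math.Cos(2.0 * Math.PI * y)))
      + 20.0 + Math.E;
  }

  static void Main()
  {
    Console.WriteLine(Ackley(0.0, 0.0)); // 0.0 (up to rounding)
    Console.WriteLine(Ackley(1.0, 1.0)); // about 3.6, a non-optimal point
  }
}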

Here are the five SciLab commands I issued:

-->[x,y]=meshgrid(-4:.10:4,-4:.10:4);
-->z=-20 * exp(-0.2*sqrt(0.5 * (x.^2 + y.^2)))
     - exp(0.5 * (cos(2*%pi*x) + cos(2*%pi*y)))
     + 20 + %e;
-->g=scf();
-->g.color_map=jetcolormap(64);
-->surf(x,y,z);

The first command sets up matrices of (x,y) grid values from -4 to +4, spaced 0.10 units apart. The second command is the definition of Ackley’s function for two variables. Note the use of the element-wise .^ operator rather than the matrix ^ operator. The third through fifth commands create the graph.

[Screenshot: graph of Ackley's function]

Notice that Ackley’s function has many local minimum values, which makes finding the optimal solution at (0,0) a bit tricky for some numerical optimization algorithms.

Posted in Machine Learning

Introduction to Microsoft Machine Learning Studio

I wrote an article titled “Introduction to Machine Learning Studio” in the September 2014 issue of MSDN Magazine. See http://msdn.microsoft.com/en-us/magazine/dn781358.aspx. Actually the title is a bit inaccurate (on purpose). Machine Learning Studio (ML Studio) is a Web application that allows you to create a machine learning system using drag-and-drop modules — no coding at all. ML Studio is essentially the UI for Microsoft Azure Machine Learning, which is a Web service that does all the work.

[Figure 1: A complete ML Studio experiment]

In the article, I explain, step-by-step, how to create a system that predicts whether a U.S. Congressman is a Republican or Democrat, based on their voting record. The example uses Logistic Regression, but ML Studio supports several different ML classification techniques, including neural networks.
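For readers who haven't seen logistic regression classification before, here is a minimal C# sketch of how a trained model produces a prediction. The vote encoding, weight values, and bias below are made-up illustrations, not values computed by ML Studio.

using System;
class LogisticRegressionSketch
{
  static void Main()
  {
    // hypothetical encoding: 1.0 = voted yes, 0.0 = voted no, on three bills
    double[] votes = new double[] { 1.0, 0.0, 1.0 };
    // hypothetical trained weights and bias
    double[] weights = new double[] { 1.5, -0.8, 2.1 };
    double bias = -1.2;

    double z = bias;
    for (int i = 0; i < votes.Length; ++i)
      z += weights[i] * votes[i];

    double p = 1.0 / (1.0 + Math.Exp(-z)); // logistic sigmoid
    Console.WriteLine("Pseudo-probability = " + p);
    Console.WriteLine(p >= 0.5 ? "Republican" : "Democrat");
  }
}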

Overall, I am quite a fan of ML Studio. Using the tool saves a ton of time compared to my normal code-from-scratch approach. On the downside, ML Studio does not yet have an SDK so it’s not currently possible to create custom-coded modules using C# (although ML Studio can execute R language scripts, which is sort of a customization approach).

And, I’m really not a fan of Cloud-based applications. I know they’re the wave of the future and inevitable, but I prefer having my entire system local, sitting in a PC in my office. I get annoyed by Cloud-based systems that are constantly changing — I like a stable platform to work on even if it means I need to install upgrades every now and then.

Posted in Machine Learning

Winnow Classification using C#

I wrote an article titled “Winnow Classification using C#” in the September 2014 issue of MSDN Magazine. See http://msdn.microsoft.com/en-us/magazine/dn781362.aspx. In machine learning, an example of a classification problem is predicting which class (male or female) a person belongs to, based on features such as occupation, political party, height, and so on. There are many classification techniques, including neural network classifiers and logistic regression classifiers.

[Screenshot: Winnow classification demo]

Winnow classification is a relatively little-used technique. It applies when the dependent y-variable to predict is binary (two possible values, like sex) and each of the independent predictor variables is also binary. The example I present in the MSDN article predicts whether a member of the U.S. Congress is a Democrat or a Republican, based on 16 predictor votes, each of which can be “yes” or “no”.
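To give a feel for the technique, here is a minimal C# sketch of the classic Winnow prediction and update rules. I'm assuming binary 0/1 inputs, a promotion/demotion factor of 2.0, and a threshold equal to the number of predictors; these are common textbook choices, not necessarily the exact values used in the article's demo.

using System;
class WinnowSketch
{
  // predict 1 if the weighted sum of active inputs reaches the threshold
  static int Predict(double[] w, int[] x, double threshold)
  {
    double sum = 0.0;
    for (int i = 0; i < x.Length; ++i)
      sum += w[i] * x[i];
    return (sum >= threshold) ? 1 : 0;
  }

  static void Main()
  {
    int n = 4;                      // number of binary predictors
    double[] w = new double[n];     // weights all start at 1.0
    for (int i = 0; i < n; ++i) w[i] = 1.0;
    double alpha = 2.0;             // promotion/demotion factor
    double threshold = n;           // a common default

    // one hypothetical training item: votes (1 = yes, 0 = no), desired class 1
    int[] x = new int[] { 1, 0, 1, 1 };
    int desired = 1;

    int predicted = Predict(w, x, threshold);
    if (predicted == 0 && desired == 1) {
      // false negative: promote (multiply) weights of active inputs
      for (int i = 0; i < n; ++i)
        if (x[i] == 1) w[i] *= alpha;
    }
    else if (predicted == 1 && desired == 0) {
      // false positive: demote (divide) weights of active inputs
      for (int i = 0; i < n; ++i)
        if (x[i] == 1) w[i] /= alpha;
    }

    Console.Write("Weights after one update: ");
    for (int i = 0; i < n; ++i)
      Console.Write(w[i] + " ");
    Console.WriteLine();
  }
}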

In my opinion, there is no clear evidence that tells whether Winnow Classification is better, worse, or about equal to other classification techniques.

Posted in Machine Learning

Neural Network Batch and Online Training

I wrote an article titled “Understanding Neural Network Batch Training: A Tutorial” in the August 2014 issue of Visual Studio Magazine. See http://visualstudiomagazine.com/articles/2014/08/01/batch-training.aspx.

[Screenshot: neural network batch training demo]

By far the most common technique used to train a neural network is called back-propagation. But there are alternatives, including particle swarm optimization and simplex optimization. Each of the possible training techniques can be applied using one of two approaches, usually called “batch” and “online” training.

In batch training, on each iteration of the main training loop, all training data items are examined, and their accumulated errors (the differences between the computed output values and the actual output values) are then used to modify the network’s weights and bias values once per pass.

In online training, on each iteration of the main training loop, the weights and bias values are updated immediately after each individual training item is fed to the network.
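Here is a minimal C# sketch that contrasts the two update schemes. The Gradient method is a made-up placeholder standing in for whatever error/gradient computation the actual training technique (back-propagation, particle swarm, simplex) would perform; only the position of the weight update differs between the two trainers.

using System;
class BatchVsOnlineSketch
{
  // placeholder "gradient": pulls every weight toward the item's first value
  // (purely illustrative, not a real neural network error signal)
  static double[] Gradient(double[] w, double[] item)
  {
    double[] g = new double[w.Length];
    for (int i = 0; i < w.Length; ++i)
      g[i] = w[i] - item[0];
    return g;
  }

  static void TrainBatch(double[] w, double[][] data, int epochs, double lr)
  {
    for (int e = 0; e < epochs; ++e)
    {
      double[] accum = new double[w.Length];
      foreach (double[] item in data)       // examine all items first
      {
        double[] g = Gradient(w, item);
        for (int i = 0; i < w.Length; ++i) accum[i] += g[i];
      }
      for (int i = 0; i < w.Length; ++i)    // then update once per pass
        w[i] -= lr * accum[i];
    }
  }

  static void TrainOnline(double[] w, double[][] data, int epochs, double lr)
  {
    for (int e = 0; e < epochs; ++e)
      foreach (double[] item in data)       // update after every item
      {
        double[] g = Gradient(w, item);
        for (int i = 0; i < w.Length; ++i) w[i] -= lr * g[i];
      }
  }

  static void Main()
  {
    double[][] data = { new double[] { 1.0 }, new double[] { 3.0 } };
    double[] wBatch = { 0.0 };
    double[] wOnline = { 0.0 };
    TrainBatch(wBatch, data, 10, 0.1);
    TrainOnline(wOnline, data, 10, 0.1);
    Console.WriteLine(wBatch[0] + " vs " + wOnline[0]); // both approach 2.0
  }
}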

Research suggests that online training is quite a bit superior to batch training when using back-propagation, but it is not clear whether online training is also better when using particle swarm or simplex optimization.

Posted in Machine Learning

Scraping Active Directory E-mail Addresses using PowerShell

In my day-to-day software development activities, I rarely use PowerShell. My programming tasks usually don’t hit PowerShell’s sweet spot between simple jobs (when I use a .bat file) and more complex ones (when I use C#). But for one specific task — finding e-mail addresses on an Active Directory network — a neat PowerShell script is my utility of choice.

[Screenshot: scraping Active Directory using PowerShell]

My PowerShell script is named makeDistList.ps1 and starts with some documentation comments:

# makeDistList.ps1 
# create an e-mail list of certain employees.
# run as C:\> .\makeDistList.ps1 to view on screen
# run as C:\> .\makeDistList.ps1 > mylist.txt to save

# "cn" -- common name, first name & last name
# "sn" -- surname
# "givenName" -- first name
# "name" -- usually same as 'cn' but not always
# "mail" -- e-mail address
# "company" -- like 'Acme'
# "title" -- for some employees only
# "manager" -- employee's manager
# "description" -- misc. info such as '(Sponsor: jLee)'
# "department" -- Cost Center name
# "telephoneNumber" -- office phone number

# set up attribute of AD users to search for
# Examples:
#$target = "(&(objectCategory=User)(Title=*SDET*))"
# users with 'SDET' in their job titles

#$target = "(&(objectCategory=User)(company=*Acme*))"
# users who are contractors through Acme

#$target = "(&(objectCategory=User)(telephoneNumber=*706*))"
# users whose phone number contains '706'

# set the filter actually used below (pick one of the examples above)
$target = "(&(objectCategory=User)(Title=*SDET*))"

Next, I set up a wonky .NET DirectorySearcher object:

$root = new-object directoryServices.directoryEntry      # domain root
$seeker = new-object directoryServices.directorySearcher
$seeker.searchRoot = $root
$seeker.SearchScope = "Subtree"    # search the entire tree below the root
$seeker.serverTimeLimit = -1       # -1 = use the default limit
$seeker.clientTimeout = -1         # -1 = use the default timeout
$seeker.filter = $target
$seeker.pageSize = 10              # page results so all matches come back

Then I set up which AD fields to retrieve (in this case, user name and e-mail address):

$colProplist = "cn", "mail"
foreach ($i in $colPropList) {
 $seeker.PropertiesToLoad.Add($i) | out-null
}

Next I set up any users I want to exclude in an array:

$omitList = @("jSmith@foo.com", "cLee@foo.com")

Now, invoke and show how many users were found:

$resultsCollection = $seeker.findAll()
write-host $resultsCollection.Count  

The hardest part was figuring out how to extract the results from the return object, and display them:

# create the list of e-mail addresses
$allMail = ""
foreach ($result in $resultsCollection) {           
  $propertiesCollection = $result.Properties  

  $mailValues = $propertiesCollection.mail          
  if ($mailValues -eq $null) {
    write-host "No mail found for: ", $propertiesCollection.cn
  }
  else {
    $mailString = $mailValues.Item(0)
    $addIt = $true

    foreach ($m in $omitList) {
      if ($m -eq $mailString) {
        write-host "NOT adding", $mailString
        $addIt = $false
        break
      }
    }

    # omit special cases
    if ($mailString.Contains("toronto")) {
      $addIt = $false
    }

    if ($addIt -eq $true) {
      
      write-host "Adding", $mailString
      $allMail += $mailString + ";"
    }                 
  } #else                                               
} #foreach $result

The script finishes by adding in special people and displaying the results:

$allMail += ";aBrown@bar.com;bJones@place.com"
write-output $allMail # show results!
write-host "Done"
# end script

I use this PowerShell script, or some variation of it, quite often. For generating e-mail addresses, PowerShell is an excellent tool for me.

Posted in Miscellaneous

Implementing the BFGS Minimization Function using C#

Last weekend, just for fun, I decided I’d take a look at the BFGS (Broyden–Fletcher–Goldfarb–Shanno) numeric minimization algorithm. The algorithm finds the minimum value of a function. To get started, I examined the C language implementation presented in the book “Numerical Recipes in C” (NR). The function is listed in Section 10.7 and is called dfpmin (“DFP minimization”) because BFGS is really just a slight variation of the earlier DFP (Davidon-Fletcher-Powell) algorithm. The NR implementation calls a helper function lnsrch (“line search”) from Section 9.7.

So, I refactored the NR C language code to C#. It was quite interesting in the sense that I had to sharpen up my C skills to understand what was going on. The NR code is, basically, crazy.

[Screenshot: BFGS minimization demo run]

The BFGS algorithm (not to be confused with the even more complicated L-BFGS, the “limited memory” version) is based on calculus techniques such as function gradients (first derivatives) and the Hessian matrix of second partial derivatives. The most striking thing about BFGS is the number of ways that the function can fail. As it turns out, robust implementations of BFGS and L-BFGS exist, but they contain a huge number of special-case checks. My point is that the NR code for BFGS is an OK way to start understanding the algorithm, but, in addition to being copyrighted, the NR code is not suitable for anything but experimentation because it can fail in dozens of situations.

using System;
namespace BFGSExperiments
{
  class Program
  {
    static void Main(string[] args)
    {
      try
      {
        Console.WriteLine("\nBegin\n");

        int iter;
        double fret;
        double[] p = new double[] {1, 1};
        Minimize(p, 2, 0.0001, out iter, out fret,
          FunctionToMinimize, GradientOfFunction);

        Console.WriteLine("Iterations = " + iter);
        Console.WriteLine("Function min value = " +
          fret);
        Console.WriteLine("Minimized at " + p[0] +
          " " + p[1]);

        Console.WriteLine("\nEnd\n");
        Console.ReadLine();
      }
      catch (Exception ex)
      {
        Console.WriteLine(ex.Message);
        Console.ReadLine();
      }


    } // Main

    public delegate double SomeFunction(double[] fvals);
    public delegate void GradFunction(double[] x,
      double[] grads);

    // -----------------------------------------------------------

    public static double FunctionToMinimize(double[] x)
    {
      return -Math.Exp(-Math.Pow(x[0] - 1, 2)) -
        Math.Exp(-0.5 * Math.Pow(x[1] - 2, 2));
      // has min at x = 1, y = 2
    }

    public static void GradientOfFunction(double[] x,
      double[] grads)
    {
      grads[0] = 2 * Math.Exp(-Math.Pow(x[0] - 1, 2)) *
        (x[0] - 1);
      grads[1] = Math.Exp(-0.5 * Math.Pow(x[1] - 2, 2)) *
        (x[1] - 2);
    }

    // -----------------------------------------------------------

    public static void Minimize(double[] p, int n,
      double gtol, out int iter, out double fret,
      SomeFunction func, GradFunction dfunc)
    {
      // aka dfpmin, aka BFGS
      // starting point p[] of length n, minimize func, 
      // using its gradient gfunc
      // returns are p[] (location of min value),
      // iter (number iterations performed),
      // fret (min value of function at p[])
      const int ITMAX = 200;
      const double EPS = 3.0e-8;
      const double TOLX = 4.0 * EPS;
      const double STPMX = 100.0;

      int check, i, its, j;
      double den, fac, fad, fae, fp, stpmax, sum,
        sumdg, sumxi, temp, test;
      double[] dg, g, hdg, pnew, xi;
      double[][] hessin;

      iter = 0;
      sum = 0.0; // keep compiler happy

      dg = new double[n];
      g = new double[n];
      hdg = new double[n];
      hessin = MakeMatrix(n, n);
      pnew = new double[n];
      xi = new double[n];
      fp = func(p);
      Console.WriteLine("starting fp = " + fp);
      dfunc(p, g);

      Console.WriteLine("starting Grads: g[0] = " +
        g[0] + " g[1] = " + g[1]);
      Console.ReadLine();

      for (i = 0; i < n; ++i)  {
        for (j = 0; j < n; ++j) hessin[i][j] = 0.0;
        hessin[i][i] = 1.0;
        xi[i] = -g[i];
        sum += p[i] * p[i];
      }
      
      stpmax = STPMX * Math.Max(Math.Sqrt(sum),
        (double)n);
      for (its = 1; its <= ITMAX; ++its) // main loop
      {
        iter = its;
        LineSearch(n, p, fp, g, xi, pnew, out fret, 
          stpmax, out check, func);

        fp = fret;
        Console.WriteLine("fp in loop = " + fp);
        for (i = 0; i < n; ++i)  {
          xi[i] = pnew[i] - p[i];
          p[i] = pnew[i];
        }
        Console.WriteLine("New p0 p1 = " +
          p[0] + " " + p[1]);
        Console.ReadLine();
        test = 0.0;
        for (i = 0; i < n; ++i)  {
          temp = Math.Abs(xi[i]) / Math.Max(Math.Abs(p[i]), 1.0);
          if (temp > test) test = temp;
        }

        if (test < TOLX) {
          Console.WriteLine("Exiting when test = " +
            test + " < tolx = " + TOLX);
          return;
        }

        for (i = 0; i < n; ++i) dg[i] = g[i];
        dfunc(p, g);
        test = 0.0;
        den = Math.Max(fret, 1.0);
        for (i = 0; i < n; ++i)  {
          temp = Math.Abs(g[i]) * Math.Max(Math.Abs(p[i]), 1.0) / den;
          if (temp > test) test = temp;
        }
        if (test < gtol)  {
          Console.WriteLine("Exiting when test = " +
            test + " < gtol = " + gtol);
          return;
        }

        for (i = 0; i < n; ++i) dg[i] = g[i] - dg[i];
        for (i = 0; i < n; ++i) {
          hdg[i] = 0.0;
          for (j = 0; j < n; ++j)
            hdg[i] += hessin[i][j] * dg[j];
        }
        fac = fae = sumdg = sumxi = 0.0;
        for (i = 0; i < n; ++i)  {
          fac += dg[i] * xi[i];
          fae += dg[i] * hdg[i];
          sumdg += dg[i] * dg[i];
          sumxi += xi[i] * xi[i];
        }
        if (fac > Math.Sqrt(EPS * sumdg * sumxi))
        {
          fac = 1.0 / fac;
          fad = 1.0 / fae;
          for (i = 0; i < n; ++i)
            dg[i] = fac * xi[i] - fad * hdg[i];
          for (i = 0; i < n; ++i) {
            for (j = 0; j < n; ++j) {
              hessin[i][j] += fac * xi[i] * xi[j]
                - fad * hdg[i] * hdg[j] +
                  fae * dg[i] * dg[j];
              hessin[j][i] = hessin[i][j];
            }
          }
        }

        for (i = 0; i < n; ++i) {
          xi[i] = 0.0;
          for (j = 0; j < n; ++j)
            xi[i] -= hessin[i][j] * g[j];
        }


      } // main loop
      throw new Exception("Too many iterations " + iter +
        " in Minimize");

    } // Minimize

    public static double[][] MakeMatrix(int rows, int cols)
    {
      double[][] result = new double[rows][];
      for (int i = 0; i < rows; ++i)
        result[i] = new double[cols];
      return result;
    }

    public static void LineSearch(int n, double[] xold,
      double fold, double[] g, double[] p, double[] x,
      out double f, double stpmax, out int check,
       SomeFunction func)
    {
      // aka lnsrch

      const double ALF = 1.0e-4;
      const double TOLX = 1.0e-7;

      int i;
      double a, alam, alam2, alamin, b, disc, f2, rhs1,
      rhs2, slope, sum, temp, test, tmplam;

      check = 0;
      for (sum = 0.0, i = 0; i < n; ++i)
        sum += p[i] * p[i];
      sum = Math.Sqrt(sum);
      if (sum > stpmax)
        for (i = 0; i < n; ++i)
          p[i] *= stpmax / sum;
 
      for (slope = 0.0, i = 0; i < n; ++i)
        slope += g[i] * p[i];
      if (slope >= 0.0)
        throw new Exception("Roundoff problem LineSearch");

      test = 0.0;
      for (i = 0; i < n; ++i)  {
        temp = Math.Abs(p[i]) / Math.Max(Math.Abs(xold[i]), 1.0);
        if (temp > test) test = temp;
      }
      alamin = TOLX / test;
      alam = 1.0;
      
      int sanityCt = 0;
      int maxSanity = 1000000;
      f = 0.0; // keeps compiler happy
      f2 = 0.0;
      alam2 = 0.0;
      while (true && sanityCt < maxSanity)
      {
        ++sanityCt;

        for (i = 0; i < n; ++i)
          x[i] = xold[i] + alam * p[i];
        f = func(x);
        if (alam < alamin) {
          for (i = 0; i < n; ++i)
            x[i] = xold[i];
          check = 1;
          return;
        } else if (f <= fold + ALF * alam * slope)
            return;
        else  {
          if (alam == 1.0)
            tmplam = -slope / (2.0 * (f - fold - slope));
          else  {
            rhs1 = f - fold - alam * slope;
            rhs2 = f2 - fold - alam2 * slope;
            a = (rhs1 / (alam * alam) - rhs2 /
              (alam2 * alam2)) / (alam - alam2);
            b = (-alam2 * rhs1 / (alam * alam) + alam *
              rhs2 / (alam2 * alam2)) /
              (alam - alam2);
            if (a == 0.0) tmplam = -slope / (2.0 * b);
            else  {
              disc = b * b - 3.0 * a * slope;
              if (disc < 0.0) tmplam = 0.5 * alam;
              else if (b <= 0.0)
                tmplam = (-b + Math.Sqrt(disc)) / (3.0 * a);
              else
                tmplam = -slope / (b + Math.Sqrt(disc));
            }
            if (tmplam > 0.5 * alam)
              tmplam = 0.5 * alam;
          } // else

        } // else
        alam2 = alam;
        f2 = f;
        alam = Math.Max(tmplam, 0.1 * alam);
      } // while

      if (sanityCt == maxSanity)
        throw new Exception("Insane in LineSearch");
    } // LineSearch
  } // Program
} // ns
Posted in Machine Learning

World Chess Champion Performances at the Chess Olympiads

The 2014 Chess Olympiad was held from August 1-14 in Tromso, Norway. Many people, including me, were surprised that the reigning world champion, Magnus Carlsen, lost two games (to Arkadij Naiditsch of Germany in round 7, and to Ivan Saric of Croatia in round 10). This got me wondering about other world champions who have played in Olympiads. So, I parsed through online databases and Wikipedia articles to collect the information below.

Olympiad  Champion   Score
=====================================
1930      Alekhine   9/9       = 100%
1931      Alekhine   13.5/18   = 75%
1933      Alekhine   9.5/12    = 79%
1935      Alekhine   12/17     = 71%
1937      Euwe       9.5/13    = 73%
1939      Alekhine   7.5/10    = 75%
1954      Botvinnik  8.5/11    = 77%
1956      Botvinnik  9.5/13    = 73%
1958      Smyslov    9.5/12    = 79%
1960      Tal        11/15     = 73%
1962      Botvinnik  8/12      = 67%
1964      Petrosian  9.5/13    = 73%
1966      Petrosian  11.5/13   = 88%
1968      Petrosian  10.5/12   = 88%
1970      Spassky    9.5/12    = 79%
1980      Karpov     9/12      = 75%
1982      Karpov     6.5/8     = 81%
1986      Kasparov   8.5/11    = 77%
1988      Kasparov   8.5/10    = 85%
1992      Kasparov   8.5/10    = 85%
1994      Kasparov   6.5/10    = 65%
1996      Kasparov   7/9       = 78%
2014      Carlsen    6/9       = 67%

According to my research, the reigning world champion did not play in the Olympiads for 1927 (Capablanca), 1928, 1950, 1952, 1972 (Fischer), 1974, 1976, 1978, 1984, 1990, 1998, 2000, 2002 (Kramnik), 2004, 2006 (Topalov), 2008 (Anand), 2010, and 2012.

I likely made a few mistakes because the data had to be gathered from several sources. For the years when there were dual world champions, notably when Kasparov split from FIDE, I picked the champion I decided was the more legitimate.

So, the data more or less speaks for itself, as they say. It’s hard to compare these results because the number of teams and the strength of the competition have varied over time. But that said, I’d say all the world champions’ Olympiad performances were extremely good. I am a big Magnus Carlsen fan, but his 2014 Olympiad score of 67% is tied for the second worst on record, although he’s in good company: Botvinnik in 1962 (67%) and Kasparov in 1994 (65%). To Carlsen’s great credit, he played in the 2014 Olympiad, with really nothing to gain, unlike previous champion Anand, who did not play in any of the three Olympiads held while he was champion.

[Photo: Carlsen and Kasparov]

Posted in Miscellaneous