Linear Ridge Regression from Scratch Using C#

One Sunday morning, just for fun, I decided to refactor one of my C# language linear ridge regression programs. My goal was to use an OOP design and an API that resemble the scikit-learn library LinearRegression module.

First, I created some synthetic data using a 3-10-1 PyTorch neural network with random weights. Each of the three predictor variables is between -1.0 and +1.0, and each target y value is between 0 and 1. I made 40 training items and 10 test items. The synthetic data looks like:

-0.2065,  0.0776, -0.1616, 0.3096
 0.7562, -0.9452,  0.3409, 0.2839
-0.1654,  0.1174, -0.7192, 0.4079
 0.9365, -0.3732,  0.3846, 0.2787
. . . 

The first three values on each line are the predictors; the last value is the target.

I set up the coefficient vector so that the intercept/constant/bias is at [0] and the coefficients/weights are at [1], [2], [3].
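
With this layout, a prediction is computed as y = coeffs[0] + (coeffs[1] * x[0]) + (coeffs[2] * x[1]) + (coeffs[3] * x[2]). Here is a minimal sketch, using made-up coefficient values purely for illustration:

double[] coeffs = new double[] { 0.31, 0.15, -0.08, 0.22 };  // [0] is intercept; values illustrative
double[] x = new double[] { 0.5, -0.6, 0.7 };  // three predictors

double y = coeffs[0];  // start with the constant/bias
for (int i = 0; i < x.Length; ++i)
  y += coeffs[i + 1] * x[i];  // weights are offset by one position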

The most interesting part of the exercise was the Train() method:

public void Train(double[][] trainX, double[] trainY)
{
  double[][] DX = 
    Utils.MatMakeDesign(trainX);  // add 1s column

  // coeffs = inv(DXt * DX) * DXt * Y 
  double[][] a = Utils.MatTranspose(DX);
  double[][] b = Utils.MatProduct(a, DX);

  for (int i = 0; i < b.Length; ++i)  // ridge regression
    b[i][i] += this.alpha;

  double[][] c = Utils.MatInverse(b); 
  double[][] d = Utils.MatProduct(c, a);
  double[][] Y = Utils.VecToMat(trainY, DX.Length, 1);
  double[][] e = Utils.MatProduct(d, Y);
  this.coeffs = Utils.MatToVec(e);
}

The trainX matrix has size (40,3). It is used to create a design matrix — a leading column of 1s is added to act as dummy input values for the constant/intercept term. The closed form solution for the optimal constant and coefficients is coeffs = inv(DXt * DX) * DXt * Y, where DXt is the transpose of the design matrix and Y is the matrix of target values. The dimensions chain together: DX is (40,4), DXt is (4,40), DXt * DX and its inverse are (4,4), the product with DXt is (4,40), and the final product with the (40,1) Y matrix yields the (4,1) coefficient result. In order to use matrix multiplication, the 1-D trainY vector had to be transformed into a 2-D matrix with 40 rows and 1 column.

My demo code adds a small alpha value to the diagonal of the DXt * DX matrix before inverting. This is called ridge regression. It simultaneously prevents errors when inverting (“conditions” the matrix) and deters the coefficients from becoming very large, which leads to model overfitting (L2 regularization). A small demonstration of the conditioning effect appears after the derivation below.

Here is a brief derivation of the closed form solution for the coefficients, where DX is the design matrix:


1. y = DX * w  | DX is not square, so it has no inverse.

2. DXt * y = (DXt * DX) * w  | pre-multiply both sides by DXt; DXt * DX is square.

3. inv(DXt * DX) * DXt * y = inv(DXt * DX) * (DXt * DX) * w  | pre-multiply both sides by inv(DXt * DX).

4. inv(DXt * DX) * DXt * y = w  | because inv(A) * A = I.

Note that if inv(DXt * DX) doesn’t exist, which is possible for several reasons (for example, when one predictor column is a linear combination of the others), this version of the closed form solution fails. Adding alpha to the matrix diagonal decreases the likelihood of this happening.
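
To see the conditioning effect in isolation, here is a minimal sketch that uses the MatInverse() and MatShow() routines from the Utils class in the demo program below. The 2x2 matrix is deliberately singular (the second row is twice the first), so MatInverse() would hit a zero pivot and produce Infinity values; after alpha is added to the diagonal, the determinant is 0.2525 and inversion succeeds:

double[][] s = new double[][] {
  new double[] { 1.0, 2.0 },
  new double[] { 2.0, 4.0 } };  // singular: det = 0

double alpha = 0.05;
for (int i = 0; i < s.Length; ++i)
  s[i][i] += alpha;  // now det = (1.05 * 4.05) - (2 * 2) = 0.2525

double[][] sInv = Utils.MatInverse(s);  // a valid inverse
Utils.MatShow(sInv, 4, 9);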

A lot of fun!



Linear regression is relatively crude because it assumes that the data lies on a straight line (or hyperplane). But linear regression has surprising power. Left: Roberto Reif is one of several people who have implemented algorithms that take a photo image and then compute how to draw straight lines from edge to edge so that the intersections of the lines correspond to dark pixels in the source image. Right: Artist Petros Vrellis uses a computer program to compute straight line drawings, projects the results onto a wall, and then implements the image by hand using black thread.


Complete demo program.

using System;
using System.IO;

namespace LinearRidgeRegressNumeric
{
  internal class LinearRidgeRegressNumericProgram
  {
    static void Main(string[] args)
    {
      Console.WriteLine("\nLinear ridge regression demo ");

      // 1. load data from file to memory
      Console.WriteLine("\nLoading train and test data ");
      string trainFile = 
        "..\\..\\..\\Data\\synthetic_train.txt";
      double[][] trainX =
        Utils.MatLoad(trainFile,
        new int[] { 0, 1, 2 }, ',', "#");
      double[] trainY =
        Utils.MatToVec(Utils.MatLoad(trainFile,
        new int[] { 3 }, ',', "#"));

      string testFile = 
        "..\\..\\..\\Data\\synthetic_test.txt";
      double[][] testX =
        Utils.MatLoad(testFile,
        new int[] { 0, 1, 2 }, ',', "#");
      double[] testY =
        Utils.MatToVec(Utils.MatLoad(testFile,
        new int[] { 3 }, ',', "#"));

      Console.WriteLine("\nFirst three train X inputs: ");
      for (int i = 0; i < 3; ++i)
        Utils.VecShow(trainX[i], 4, 9, true);

      Console.WriteLine("\nFirst three train y targets: ");
      for (int i = 0; i < 3; ++i)
        Console.WriteLine(trainY[i].ToString("F4"));

      // 2. create LR model
      double alpha = 0.05;
      Console.WriteLine("\nCreating and training LRR model" +
        " with alpha = " + alpha.ToString("F3"));
      Console.WriteLine("Coefficients = " +
        " inv(DXt * DX) * DXt * Y ");
      LinearRidgeRegress lrrModel = 
        new LinearRidgeRegress(alpha);

      // 3. train model
      lrrModel.Train(trainX, trainY);
      Console.WriteLine("Done. Model constant," +
        " coefficients = ");
      Utils.VecShow(lrrModel.coeffs, 4, 9, true);

      // 4. evaluate model
      Console.WriteLine("\nComputing model accuracy ");
      double accTrain = 
        Accuracy(lrrModel, trainX, trainY, 0.10);
      Console.WriteLine("\nAccuracy on train (0.10) = "
        + accTrain.ToString("F4"));

      double accTest = 
        Accuracy(lrrModel, testX, testY, 0.10);
      Console.WriteLine("Accuracy on test (0.10) = " +
        accTest.ToString("F4"));

      double rmseTrain =
        RootMSE(lrrModel, trainX, trainY);
      Console.WriteLine("\nRMSE on train = "
        + rmseTrain.ToString("F4"));

      double rmseTest =
        RootMSE(lrrModel, testX, testY);
      Console.WriteLine("RMSE on test = " +
        rmseTest.ToString("F4"));

      // 5. use model
      Console.WriteLine("\nPredicting x = 0.5, -0.6, 0.7 ");
      double[] x = new double[] { 0.5, -0.6, 0.7 };
      double predY = lrrModel.Predict(x);
      Console.WriteLine("\nPredicted y = " +
        predY.ToString("F4"));

      // 6. TODO: save model to file

      Console.WriteLine("\nEnd linear ridge regression ");
      Console.ReadLine();
    } // Main

    static double Accuracy(LinearRidgeRegress model,
      double[][] dataX, double[] dataY, double pctClose)
    {
      int numCorrect = 0; int numWrong = 0;
      int n = dataX.Length;
      for (int i = 0; i < n; ++i)
      {
        double[] x = dataX[i];
        double yActual = dataY[i];
        double yPred = model.Predict(x);
        if (Math.Abs(yActual - yPred) <
          Math.Abs(pctClose * yActual))
          ++numCorrect;
        else
          ++numWrong;
      }
      return (numCorrect * 1.0) / n;
    }

    static double RootMSE(LinearRidgeRegress model,
      double[][] dataX, double[] dataY)
    {
      double sum = 0.0;
      int n = dataX.Length;
      for (int i = 0; i < n; ++i)
      {
        double[] x = dataX[i];
        double yActual = dataY[i];
        double yPred = model.Predict(x);
        sum += (yActual - yPred) * (yActual - yPred);
      }
      return Math.Sqrt(sum / n);
    }

  } // Program

  public class LinearRidgeRegress
  {
    public double[]? coeffs;
    public double alpha;

    public LinearRidgeRegress(double alpha)
    {
      this.coeffs = null;
      this.alpha = alpha;  // L2 regularization strength
    }

    public void Train(double[][] trainX, double[] trainY)
    {
      double[][] DX = 
        Utils.MatMakeDesign(trainX);  // add 1s column

      // coeffs = inv(DXt * DX) * DXt * Y 
      double[][] a = Utils.MatTranspose(DX);
      double[][] b = Utils.MatProduct(a, DX);

      for (int i = 0; i < b.Length; ++i)  // ridge regression
        b[i][i] += this.alpha;

      double[][] c = Utils.MatInverse(b); 
      double[][] d = Utils.MatProduct(c, a);
      double[][] Y = Utils.VecToMat(trainY, DX.Length, 1);
      double[][] e = Utils.MatProduct(d, Y);
      this.coeffs = Utils.MatToVec(e);
    }

    public double Predict(double[] x)
    {
      // constant at coeffs[0]
      double sum = 0.0;
      int n = x.Length;  // number predictors
      for (int i = 0; i < n; ++i)
        sum += x[i] * this.coeffs[i + 1];

      sum += this.coeffs[0];  // add the constant
      return sum;
    }

  } // class LinearRidgeRegress

  public class Utils
  {
    public static double[][] VecToMat(double[] vec,
      int nRows, int nCols)
    {
      double[][] result = MatCreate(nRows, nCols);
      int k = 0;
      for (int i = 0; i < nRows; ++i)
        for (int j = 0; j < nCols; ++j)
          result[i][j] = vec[k++];
      return result;
    }

    public static double[][] MatCreate(int rows, int cols)
    {
      double[][] result = new double[rows][];
      for (int i = 0; i < rows; ++i)
        result[i] = new double[cols];
      return result;
    }

    public static double[] MatToVec(double[][] m)
    {
      int rows = m.Length;
      int cols = m[0].Length;
      double[] result = new double[rows * cols];
      int k = 0;
      for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
        {
          result[k++] = m[i][j];
        }
      return result;
    }

    public static double[][] MatProduct(double[][] matA,
      double[][] matB)
    {
      int aRows = matA.Length;
      int aCols = matA[0].Length;
      int bRows = matB.Length;
      int bCols = matB[0].Length;
      if (aCols != bRows)
        throw new Exception("Non-conformable matrices");

      double[][] result = MatCreate(aRows, bCols);

      for (int i = 0; i < aRows; ++i) // each row of A
        for (int j = 0; j < bCols; ++j) // each col of B
          for (int k = 0; k < aCols; ++k)
            result[i][j] += matA[i][k] * matB[k][j];

      return result;
    }

    public static double[][] MatTranspose(double[][] m)
    {
      int nr = m.Length;
      int nc = m[0].Length;
      double[][] result = MatCreate(nc, nr);  // note
      for (int i = 0; i < nr; ++i)
        for (int j = 0; j < nc; ++j)
          result[j][i] = m[i][j];
      return result;
    }

    // -------

    public static double[][] MatInverse(double[][] m)
    {
      // assumes determinant is not 0
      // that is, the matrix does have an inverse
      int n = m.Length;
      double[][] result = MatCreate(n, n); // make a copy
      for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
          result[i][j] = m[i][j];

      double[][] lum; // combined lower & upper
      int[] perm;  // out parameter
      MatDecompose(m, out lum, out perm);  // ignore return

      double[] b = new double[n];
      for (int i = 0; i < n; ++i)
      {
        for (int j = 0; j < n; ++j)
          if (i == perm[j])
            b[j] = 1.0;
          else
            b[j] = 0.0;

        double[] x = Reduce(lum, b);
        for (int j = 0; j < n; ++j)
          result[j][i] = x[j];
      }
      return result;
    }

    private static int MatDecompose(double[][] m,
      out double[][] lum, out int[] perm)
    {
      // Crout's LU decomposition for matrix determinant
      // and inverse.
      // stores combined lower & upper in lum[][]
      // stores row permutations into perm[]
      // returns +1 or -1 according to even or odd number
      // of row permutations.
      // lower gets dummy 1.0s on diagonal (0.0s above)
      // upper gets lum values on diagonal (0.0s below)

      // even (+1) or odd (-1) row permutations
      int toggle = +1;
      int n = m.Length;

      // make a copy of m[][] into result lu[][]
      lum = MatCreate(n, n);
      for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
          lum[i][j] = m[i][j];

      // make perm[]
      perm = new int[n];
      for (int i = 0; i < n; ++i)
        perm[i] = i;

      for (int j = 0; j < n - 1; ++j) // note n-1
      {
        double max = Math.Abs(lum[j][j]);
        int piv = j;

        for (int i = j + 1; i < n; ++i) // pivot index
        {
          double xij = Math.Abs(lum[i][j]);
          if (xij > max)
          {
            max = xij;
            piv = i;
          }
        } // i

        if (piv != j)
        {
          double[] tmp = lum[piv]; // swap rows j, piv
          lum[piv] = lum[j];
          lum[j] = tmp;

          int t = perm[piv]; // swap perm elements
          perm[piv] = perm[j];
          perm[j] = t;

          toggle = -toggle;
        }

        double xjj = lum[j][j];
        if (xjj != 0.0)
        {
          for (int i = j + 1; i < n; ++i)
          {
            double xij = lum[i][j] / xjj;
            lum[i][j] = xij;
            for (int k = j + 1; k < n; ++k)
              lum[i][k] -= xij * lum[j][k];
          }
        }

      } // j

      return toggle;  // for determinant
    } // MatDecompose

    private static double[] Reduce(double[][] luMatrix,
      double[] b) // helper
    {
      int n = luMatrix.Length;
      double[] x = new double[n];
      for (int i = 0; i < n; ++i)
        x[i] = b[i];

      // forward substitution using unit-diagonal lower part
      for (int i = 1; i < n; ++i)
      {
        double sum = x[i];
        for (int j = 0; j < i; ++j)
          sum -= luMatrix[i][j] * x[j];
        x[i] = sum;
      }

      // backward substitution using upper part
      x[n - 1] /= luMatrix[n - 1][n - 1];
      for (int i = n - 2; i >= 0; --i)
      {
        double sum = x[i];
        for (int j = i + 1; j < n; ++j)
          sum -= luMatrix[i][j] * x[j];
        x[i] = sum / luMatrix[i][i];
      }

      return x;
    } // Reduce

    // -------

    private static int NumNonCommentLines(string fn,
    string comment)
    {
      int ct = 0;
      string line = "";
      FileStream ifs = new FileStream(fn,
        FileMode.Open);
      StreamReader sr = new StreamReader(ifs);
      while ((line = sr.ReadLine()) != null)
        if (line.StartsWith(comment) == false)
          ++ct;
      sr.Close(); ifs.Close();
      return ct;
    }

    public static double[][] MatLoad(string fn,
      int[] usecols, char sep, string comment)
    {
      // count number of non-comment lines
      int nRows = NumNonCommentLines(fn, comment);
      int nCols = usecols.Length;
      double[][] result = MatCreate(nRows, nCols);
      string line = "";
      string[] tokens = null;
      FileStream ifs = new FileStream(fn, FileMode.Open);
      StreamReader sr = new StreamReader(ifs);

      int i = 0;
      while ((line = sr.ReadLine()) != null)
      {
        if (line.StartsWith(comment) == true)
          continue;
        tokens = line.Split(sep);
        for (int j = 0; j < nCols; ++j)
        {
          int k = usecols[j];  // into tokens
          result[i][j] = double.Parse(tokens[k]);
        }
        ++i;
      }
      sr.Close(); ifs.Close();
      return result;
    }

    public static double[][] MatMakeDesign(double[][] m)
    {
      // add a leading column of 1s to create a design matrix
      int nRows = m.Length; int nCols = m[0].Length;
      double[][] result = MatCreate(nRows, nCols + 1);
      for (int i = 0; i < nRows; ++i)
        result[i][0] = 1.0;

      for (int i = 0; i < nRows; ++i)
      {
        for (int j = 0; j < nCols; ++j)
        {
          result[i][j + 1] = m[i][j];
        }
      }
      return result;
    }

    public static void MatShow(double[][] m,
      int dec, int wid)
    {
      for (int i = 0; i < m.Length; ++i)
      {
        for (int j = 0; j < m[0].Length; ++j)
        {
          double v = m[i][j];
          if (Math.Abs(v) < 1.0e-8) v = 0.0;  // avoid displaying -0.00
          Console.Write(v.ToString("F" +
            dec).PadLeft(wid));
        }
        Console.WriteLine("");
      }
    }

    public static void VecShow(double[] vec,
      int dec, int wid, bool newLine)
    {
      for (int i = 0; i < vec.Length; ++i)
      {
        double x = vec[i];
        if (Math.Abs(x) < 1.0e-8) x = 0.0;  // avoid displaying -0.00
        Console.Write(x.ToString("F" +
          dec).PadLeft(wid));
      }
      if (newLine == true)
        Console.WriteLine("");
    }

  } // class Utils

} // ns
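
The step 6 TODO in the Main() method is to save the trained model to file. Because a trained model is fully determined by its alpha value and its coeffs vector, one possible approach is a pair of SaveModel() and LoadModel() helpers. These are hypothetical, not part of the demo — a sketch only:

static void SaveModel(LinearRidgeRegress model, string fn)
{
  // assumes Train() has been called so coeffs is not null
  using (StreamWriter sw = new StreamWriter(fn))
  {
    sw.WriteLine(model.alpha.ToString("R"));  // "R" = round-trip format
    foreach (double c in model.coeffs)
      sw.WriteLine(c.ToString("R"));
  }
}

static LinearRidgeRegress LoadModel(string fn)
{
  string[] lines = File.ReadAllLines(fn);  // alpha, then coefficients
  LinearRidgeRegress model =
    new LinearRidgeRegress(double.Parse(lines[0]));
  model.coeffs = new double[lines.Length - 1];
  for (int i = 1; i < lines.Length; ++i)
    model.coeffs[i - 1] = double.Parse(lines[i]);
  return model;
}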

Training data.

-0.1660, 0.4406, -0.9998, 0.4487
-0.8153, -0.6275, -0.3089, 0.2757
-0.2065, 0.0776, -0.1616, 0.3096
0.7562, -0.9452, 0.3409, 0.2839
-0.1654, 0.1174, -0.7192, 0.4079
0.9365, -0.3732, 0.3846, 0.2787
0.7528, 0.7892, -0.8299, 0.4489
0.0663, 0.3838, -0.3690, 0.3740
0.3730, 0.6693, -0.9634, 0.4525
0.5786, -0.7935, -0.1042, 0.3348
0.8172, -0.4128, -0.4244, 0.3928
-0.4689, -0.0169, -0.8933, 0.4036
0.1482, -0.7065, 0.1786, 0.2809
-0.1719, 0.3888, -0.1716, 0.3385
-0.9001, 0.0718, 0.3276, 0.2309
0.1731, 0.8068, -0.7251, 0.4236
-0.7214, 0.6148, -0.2046, 0.3311
0.4520, 0.7666, 0.2473, 0.3204
0.5019, -0.3022, -0.4601, 0.3885
0.9297, 0.3269, 0.2434, 0.3353
-0.7705, 0.8990, -0.1002, 0.2941
-0.5259, 0.8068, 0.1474, 0.3022
-0.9943, 0.2343, -0.3467, 0.3144
-0.2855, 0.8171, 0.2467, 0.2934
-0.9684, 0.8589, 0.3818, 0.2229
0.5109, 0.5078, 0.8460, 0.2719
0.4230, -0.7515, -0.9602, 0.4135
0.0777, 0.1056, 0.6841, 0.2428
-0.7517, -0.4416, 0.1715, 0.2525
-0.9627, 0.6013, -0.5341, 0.3701
0.6142, -0.2243, 0.7271, 0.2510
-0.7271, -0.8802, -0.7573, 0.3173
-0.9109, -0.7850, -0.5486, 0.2943
-0.9749, -0.8561, 0.9346, 0.2487
0.1362, -0.5934, -0.4953, 0.3394
0.1627, 0.9400, 0.6937, 0.3020
-0.5203, -0.0125, 0.2399, 0.2501
-0.9628, -0.8600, -0.0273, 0.2715
0.2127, 0.1377, -0.3653, 0.3758
-0.2397, 0.1019, 0.4907, 0.2496

Test data.

0.3385, -0.4702, -0.8673, 0.4303
-0.5797, 0.5055, -0.8669, 0.4163
-0.4794, 0.6095, -0.6131, 0.4025
0.8496, -0.4734, -0.8681, 0.4909
0.4701, 0.5444, 0.8156, 0.2796
0.8980, 0.9004, 0.1133, 0.3295
0.8312, 0.2831, -0.2200, 0.4104
0.0991, 0.8524, 0.8375, 0.2744
-0.2102, 0.9265, -0.6521, 0.4078
0.8959, 0.6542, -0.9700, 0.4744