I’ve been looking at kernel ridge regression (KRR) from different points of view — different languages and different architectures. I recently put together a lightweight version of KRR using C# with a static function style. One weekend, just for fun, I refactored my C# code to use an OOP style.
The “kernel” in KRR means the technique uses the kernel function trick to deal with non-linear data. The “ridge” in KRR means that L2 regularization is applied to deter model overfitting.
First, I used a 5-10-1 neural network with random weights to create some synthetic data that looks like:
-0.1660 0.4406 -0.9998 -0.3953 -0.7065 0.4840 0.0776 -0.1616 0.3704 -0.5911 0.7562 0.1568 -0.9452 0.3409 -0.1654 0.1174 -0.7192 0.8054 . . .
The first five values on each line are the predictor variables, each between -1.0 and +1.0 to simulate normalized data. The last value on each line is the target value to predict, between 0 and 1. There are 40 training items and 10 test items. KRR works well with small datasets that are strictly numeric.
Creating and training the KRR model is nice and clean:
// load data into trainX and trainY Console.WriteLine("Creating KRR object"); double gamma = 0.1; // RBF param double alpha = 0.001; // regularization KRR krr = new KRR(gamma, alpha); krr.Train(trainX, trainY);
My demo uses a hard-coded radial basis function (RBF) with a form of rbf(x1, x2) = exp( -gamma * ||x1 – x2||^2 ) where x1 and x2 are vectors, gamma is a constant, and ||x1 – x2||^2 is squared Euclidean distance.
Using the model is simple too:
double[] x = new double[] { 0.5, -0.5, 0.5, -0.5, 0.5 }; double y = krr.Predict(x);
The trickiest part of working with KRR is finding good values for the parameters. I used a simple grid search:
double[] gammas = new double[] { 0.1, 0.2, 0.3, 0.5, 1.0, 1.5, 2.0 }; double[] alphas = new double[] { 0.0, 0.0001, 0.001, 0.01, 0.05, 0.1, 0.5 }; foreach (double g in gammas) { foreach (double a in alphas) { KRR krr = new KRR(g, a); krr.Train(trainX, trainY); double trainRMSE = RootMSE(krr, trainX, trainY); double testRMSE = RootMSE(krr, testX, testY); double trainAcc = Accuracy(krr, trainX, trainY, 0.10); double testAcc = Accuracy(krr, testX, testY, 0.10); Console.WriteLine("gamma = " + g.ToString("F1") + " alpha = " + a.ToString("F4") + " train rmse = " + trainRMSE.ToString("F4") + " test rmse = " + testRMSE.ToString("F4") + " train acc = " + trainAcc.ToString("F4") + " test acc = " + testAcc.ToString("F4")); } }
The idea is to balance root mean squared error, which is very granular, with accuracy, which is coarser.
Mucho fun!
The movie “Alien Nation” (1988) tells the story of 300,000 aliens who settle in Los Angeles. A pretty clever movie in my opinion. The movie led to a TV series. The alien language is called Tenctonese. Because there’s a one-to-one mapping between English letters and digits with Tenctonese letters and digits, it’d be possible to create a kernel ridge regression program written in Tenctonese.
Demo code. Very long! Replace “lt”, “gt”, “lte”, “gte” with Boolean operator symbols.
using System; using System.IO; namespace KernelRidgeRegression { internal class KernelRidgeRegressionProgram { static void Main(string[] args) { Console.WriteLine("\nBegin C# kernel ridge regression "); Console.WriteLine("\nLoading train and test data "); string trainFile = "..\\..\\..\\Data\\synthetic_train.txt"; double[][] trainX = Utils.MatLoad(trainFile, new int[] { 0, 1, 2, 3, 4 }, '\t', "#"); // 40 items double[] trainY = Utils.MatToVec(Utils.MatLoad(trainFile, new int[] { 5 }, '\t', "#")); string testFile = "..\\..\\..\\Data\\synthetic_test.txt"; double[][] testX = Utils.MatLoad(testFile, new int[] { 0, 1, 2, 3, 4 }, '\t', "#"); // 10 items double[] testY = Utils.MatToVec(Utils.MatLoad(testFile, new int[] { 5 }, '\t', "#")); Console.WriteLine("Done "); Console.WriteLine("\nFirst three X predictors: "); for (int i = 0; i "lt" 3; ++i) Utils.VecShow(trainX[i], 4, 9, true); Console.WriteLine("\nFirst three target y: "); for (int i = 0; i "lt" 3; ++i) Console.WriteLine(trainY[i].ToString("F4").PadLeft(8)); // explore hyperparams //double[] gammas = new double[] // { 0.1, 0.2, 0.3, 0.5, 1.0, 1.5, 2.0 }; //double[] alphas = new double[] // { 0.0, 0.0001, 0.001, 0.01, 0.05, 0.1, 0.5 }; //foreach (double g in gammas) //{ // foreach (double a in alphas) // { // KRR krr = new KRR(g, a); // krr.Train(trainX, trainY); // double trainRMSE = RootMSE(krr, trainX, trainY); // double testRMSE = RootMSE(krr, testX, testY); // double trainAcc = Accuracy(krr, trainX, trainY, 0.10); // double testAcc = Accuracy(krr, testX, testY, 0.10); // Console.WriteLine("gamma = " + g.ToString("F1") + // " alpha = " + a.ToString("F4") + // " train rmse = " + trainRMSE.ToString("F4") + // " test rmse = " + testRMSE.ToString("F4") + // " train acc = " + trainAcc.ToString("F4") + // " test acc = " + testAcc.ToString("F4")); // } //} // gamma = 0.1 alpha = 0.001 // train rmse = 0.0030 test rmse = 0.0308 // train acc = 1.0000 test acc = 0.8000 Console.WriteLine("\nCreating KRR object"); double gamma = 0.1; // RBF param double alpha = 0.001; // regularization Console.WriteLine("Setting RBF gamma = " + gamma.ToString("F1")); Console.WriteLine("Setting alpha noise = " + alpha.ToString("F3")); KRR krr = new KRR(gamma, alpha); Console.WriteLine("Done "); Console.WriteLine("\nTraining model "); krr.Train(trainX, trainY); Console.WriteLine("Done "); // Console.WriteLine("Model weights: "); // 1 per trainX // Utils.VecShow(krr.wts, 4, 9, true); Console.WriteLine("\nComputing model accuracy" + " (within 0.10) "); double trainAcc = Accuracy(krr, trainX, trainY, 0.10); double testAcc = Accuracy(krr, testX, testY, 0.10); Console.WriteLine("\nTrain acc = " + trainAcc.ToString("F4")); Console.WriteLine("Test acc = " + testAcc.ToString("F4")); double trainRMSE = RootMSE(krr, trainX, trainY); double testRMSE = RootMSE(krr, testX, testY); Console.WriteLine("\nTrain RMSE = " + trainRMSE.ToString("F4")); Console.WriteLine("Test RMSE = " + testRMSE.ToString("F4")); Console.WriteLine("\nPredicting for x = " + "[0.5, -0.5, 0.5, -0.5, 0.5] "); double[] x = new double[] { 0.5, -0.5, 0.5, -0.5, 0.5 }; double y = krr.Predict(x); // note X vector Console.WriteLine("Predicted y = " + y.ToString("F4")); Console.WriteLine("\nEnd KRR demo "); Console.ReadLine(); } // Main() static double Accuracy(KRR model, double[][] dataX, double[] dataY, double pctClose) { int numCorrect = 0; int numWrong = 0; int n = dataX.Length; for (int i = 0; i "lt" n; ++i) { double[] x = dataX[i]; double actualY = dataY[i]; double predY = model.Predict(x); if (Math.Abs(actualY - predY) "lt" Math.Abs(actualY * pctClose)) ++numCorrect; else ++numWrong; } return (numCorrect * 1.0) / n; } static double RootMSE(KRR model, double[][] dataX, double[] dataY) { double sum = 0.0; int n = dataX.Length; for (int i = 0; i "lt" n; ++i) { double[] x = dataX[i]; double actualY = dataY[i]; double predY = model.Predict(x); sum += (actualY - predY) * (actualY - predY); } return Math.Sqrt(sum / n); } } // class Program public class KRR { public double gamma; // for RBF kernel public double alpha; // regularization public double[][] trainX; // need for any prediction public double[] wts; // one per trainX item public KRR(double gamma, double alpha) { this.gamma = gamma; this.alpha = alpha; } // ctor public void Train(double[][] trainX, double[] trainY) { // 0. store trainX -- needed by Predict() this.trainX = trainX; // by ref -- could copy // 1. compute train-train K matrix int N = trainX.Length; double[][] K = Utils.MatCreate(N, N); for (int i = 0; i "lt" N; ++i) for (int j = 0; j "lt" N; ++j) K[i][j] = this.Rbf(trainX[i], trainX[j]); // 2. add regularization on diagonal for (int i = 0; i "lt" N; ++i) K[i][i] += this.alpha; // 3. compute model weights using K inverse double[][] Kinv = Utils.MatInverse(K); this.wts = Utils.VecMatProd(trainY, Kinv); // alt approach //double[][] tmp1 = Utils.VecToMat(trainY, 1, N); //double[][] tmp2 = Utils.MatProduct(tmp1, Kinv); //this.wts = Utils.MatToVec(tmp2); } // Train public double Rbf(double[] v1, double[] v2) { // the gamma version aot len_scale version int dim = v1.Length; double sum = 0.0; for (int i = 0; i "lt" dim; ++i) { sum += (v1[i] - v2[i]) * (v1[i] - v2[i]); } return Math.Exp(-1 * this.gamma * sum); // before Exp() } public double Predict(double[] x) { int N = this.trainX.Length; double sum = 0.0; for (int i = 0; i "lt" N; ++i) { double[] xx = this.trainX[i]; double k = this.Rbf(x, xx); sum += this.wts[i] * k; } return sum; } } // class KRR public class Utils { public static double[][] MatCreate(int rows, int cols) { double[][] result = new double[rows][]; for (int i = 0; i "lt" rows; ++i) result[i] = new double[cols]; return result; } public static double[] MatToVec(double[][] m) { int rows = m.Length; int cols = m[0].Length; double[] result = new double[rows * cols]; int k = 0; for (int i = 0; i "lt" rows; ++i) for (int j = 0; j "lt" cols; ++j) { result[k++] = m[i][j]; } return result; } public static double[][] MatProduct(double[][] matA, double[][] matB) { int aRows = matA.Length; int aCols = matA[0].Length; int bRows = matB.Length; int bCols = matB[0].Length; if (aCols != bRows) throw new Exception("Non-conformable matrices"); double[][] result = MatCreate(aRows, bCols); for (int i = 0; i "lt" aRows; ++i) // each row of A for (int j = 0; j "lt" bCols; ++j) // each col of B for (int k = 0; k "lt" aCols; ++k) result[i][j] += matA[i][k] * matB[k][j]; return result; } public static double[][] VecToMat(double[] vec, int nRows, int nCols) { double[][] result = MatCreate(nRows, nCols); int k = 0; for (int i = 0; i "lt" nRows; ++i) for (int j = 0; j "lt" nCols; ++j) result[i][j] = vec[k++]; return result; } public static double[] VecMatProd(double[] v, double[][] m) { // one-dim vec * two-dim mat int nRows = m.Length; int nCols = m[0].Length; int n = v.Length; if (n != nCols) throw new Exception("non-comform in VecMatProd"); double[] result = new double[n]; for (int i = 0; i "lt" n; ++i) { for (int j = 0; j "lt" nCols; ++j) { result[i] += v[j] * m[i][j]; } } return result; } //public static double[] MatVecProd(double[][] mat, // double[] vec) //{ // int nRows = mat.Length; int nCols = mat[0].Length; // if (vec.Length != nCols) // throw new Exception("Non-conforme in MatVecProd() "); // double[] result = new double[nRows]; // for (int i = 0; i "lt" nRows; ++i) // { // for (int j = 0; j "lt" nCols; ++j) // { // result[i] += mat[i][j] * vec[j]; // } // } // return result; //} public static double[][] MatTranspose(double[][] m) { int nr = m.Length; int nc = m[0].Length; double[][] result = MatCreate(nc, nr); // note for (int i = 0; i "lt" nr; ++i) for (int j = 0; j "lt" nc; ++j) result[j][i] = m[i][j]; return result; } // ------- public static double[][] MatInverse(double[][] m) { // assumes determinant is not 0 // that is, the matrix does have an inverse int n = m.Length; double[][] result = MatCreate(n, n); // make a copy for (int i = 0; i "lt" n; ++i) for (int j = 0; j "lt" n; ++j) result[i][j] = m[i][j]; double[][] lum; // combined lower & upper int[] perm; // out parameter MatDecompose(m, out lum, out perm); // ignore return double[] b = new double[n]; for (int i = 0; i "lt" n; ++i) { for (int j = 0; j "lt" n; ++j) if (i == perm[j]) b[j] = 1.0; else b[j] = 0.0; double[] x = Reduce(lum, b); // for (int j = 0; j "lt" n; ++j) result[j][i] = x[j]; } return result; } static int MatDecompose(double[][] m, out double[][] lum, out int[] perm) { // Crout's LU decomposition for matrix determinant // and inverse. // stores combined lower & upper in lum[][] // stores row permuations into perm[] // returns +1 or -1 according to even or odd number // of row permutations. // lower gets dummy 1.0s on diagonal (0.0s above) // upper gets lum values on diagonal (0.0s below) // even (+1) or odd (-1) row permutatuions int toggle = +1; int n = m.Length; // make a copy of m[][] into result lu[][] lum = MatCreate(n, n); for (int i = 0; i "lt" n; ++i) for (int j = 0; j "lt" n; ++j) lum[i][j] = m[i][j]; // make perm[] perm = new int[n]; for (int i = 0; i "lt" n; ++i) perm[i] = i; for (int j = 0; j "lt" n - 1; ++j) // note n-1 { double max = Math.Abs(lum[j][j]); int piv = j; for (int i = j + 1; i "lt" n; ++i) // pivot index { double xij = Math.Abs(lum[i][j]); if (xij "gt" max) { max = xij; piv = i; } } // i if (piv != j) { double[] tmp = lum[piv]; // swap rows j, piv lum[piv] = lum[j]; lum[j] = tmp; int t = perm[piv]; // swap perm elements perm[piv] = perm[j]; perm[j] = t; toggle = -toggle; } double xjj = lum[j][j]; if (xjj != 0.0) { for (int i = j + 1; i "lt" n; ++i) { double xij = lum[i][j] / xjj; lum[i][j] = xij; for (int k = j + 1; k "lt" n; ++k) lum[i][k] -= xij * lum[j][k]; } } } // j return toggle; // for determinant } // MatDecompose static double[] Reduce(double[][] luMatrix, double[] b) // helper { int n = luMatrix.Length; double[] x = new double[n]; //b.CopyTo(x, 0); for (int i = 0; i "lt" n; ++i) x[i] = b[i]; for (int i = 1; i "lt" n; ++i) { double sum = x[i]; for (int j = 0; j "lt" i; ++j) sum -= luMatrix[i][j] * x[j]; x[i] = sum; } x[n - 1] /= luMatrix[n - 1][n - 1]; for (int i = n - 2; i "gte" 0; --i) { double sum = x[i]; for (int j = i + 1; j "lt" n; ++j) sum -= luMatrix[i][j] * x[j]; x[i] = sum / luMatrix[i][i]; } return x; } // Reduce //public static double MatDeterminant(double[][] m) //{ // double[][] lum; // int[] perm; // double result = MatDecompose(m, out lum, out perm); // for (int i = 0; i "lt" lum.Length; ++i) // result *= lum[i][i]; // return result; //} // ------- static int NumNonCommentLines(string fn, string comment) { int ct = 0; string line = ""; FileStream ifs = new FileStream(fn, FileMode.Open); StreamReader sr = new StreamReader(ifs); while ((line = sr.ReadLine()) != null) if (line.StartsWith(comment) == false) ++ct; sr.Close(); ifs.Close(); return ct; } public static double[][] MatLoad(string fn, int[] usecols, char sep, string comment) { // count number of non-comment lines int nRows = NumNonCommentLines(fn, comment); int nCols = usecols.Length; double[][] result = MatCreate(nRows, nCols); string line = ""; string[] tokens = null; FileStream ifs = new FileStream(fn, FileMode.Open); StreamReader sr = new StreamReader(ifs); int i = 0; while ((line = sr.ReadLine()) != null) { if (line.StartsWith(comment) == true) continue; tokens = line.Split(sep); for (int j = 0; j "lt" nCols; ++j) { int k = usecols[j]; // into tokens result[i][j] = double.Parse(tokens[k]); } ++i; } sr.Close(); ifs.Close(); return result; } //public static double[][] MatMakeDesign(double[][] m) //{ // // add a leading column of 1s to create a design matrix // int nRows = m.Length; int nCols = m[0].Length; // double[][] result = MatCreate(nRows, nCols + 1); // for (int i = 0; i "lt" nRows; ++i) // result[i][0] = 1.0; // for (int i = 0; i "lt" nRows; ++i) // { // for (int j = 0; j "lt" nCols; ++j) // { // result[i][j + 1] = m[i][j]; // } // } // return result; //} public static void MatShow(double[][] m, int dec, int wid) { for (int i = 0; i "lt" m.Length; ++i) { for (int j = 0; j "lt" m[0].Length; ++j) { double v = m[i][j]; if (Math.Abs(v) "lt" 1.0e-8) v = 0.0; // hack Console.Write(v.ToString("F" + dec).PadLeft(wid)); } Console.WriteLine(""); } } //public static void VecShow(int[] vec, int wid) //{ // for (int i = 0; i "lt" vec.Length; ++i) // Console.Write(vec[i].ToString().PadLeft(wid)); // Console.WriteLine(""); //} public static void VecShow(double[] vec, int dec, int wid, bool newLine) { for (int i = 0; i "lt" vec.Length; ++i) { double x = vec[i]; if (Math.Abs(x) "lt" 1.0e-8) x = 0.0; // hack Console.Write(x.ToString("F" + dec).PadLeft(wid)); } if (newLine == true) Console.WriteLine(""); } } // class Utils } // ns
Training data. Replace commas with tab chars or modify program.
-0.1660, 0.4406, -0.9998, -0.3953, -0.7065, 0.4840 0.0776, -0.1616, 0.3704, -0.5911, 0.7562, 0.1568 -0.9452, 0.3409, -0.1654, 0.1174, -0.7192, 0.8054 0.9365, -0.3732, 0.3846, 0.7528, 0.7892, 0.1345 -0.8299, -0.9219, -0.6603, 0.7563, -0.8033, 0.7955 0.0663, 0.3838, -0.3690, 0.3730, 0.6693, 0.3206 -0.9634, 0.5003, 0.9777, 0.4963, -0.4391, 0.7377 -0.1042, 0.8172, -0.4128, -0.4244, -0.7399, 0.4801 -0.9613, 0.3577, -0.5767, -0.4689, -0.0169, 0.6861 -0.7065, 0.1786, 0.3995, -0.7953, -0.1719, 0.5569 0.3888, -0.1716, -0.9001, 0.0718, 0.3276, 0.2500 0.1731, 0.8068, -0.7251, -0.7214, 0.6148, 0.3297 -0.2046, -0.6693, 0.8550, -0.3045, 0.5016, 0.2129 0.2473, 0.5019, -0.3022, -0.4601, 0.7918, 0.2613 -0.1438, 0.9297, 0.3269, 0.2434, -0.7705, 0.5171 0.1568, -0.1837, -0.5259, 0.8068, 0.1474, 0.3307 -0.9943, 0.2343, -0.3467, 0.0541, 0.7719, 0.5581 0.2467, -0.9684, 0.8589, 0.3818, 0.9946, 0.1092 -0.6553, -0.7257, 0.8652, 0.3936, -0.8680, 0.7018 0.8460, 0.4230, -0.7515, -0.9602, -0.9476, 0.1996 -0.9434, -0.5076, 0.7201, 0.0777, 0.1056, 0.5664 0.9392, 0.1221, -0.9627, 0.6013, -0.5341, 0.1533 0.6142, -0.2243, 0.7271, 0.4942, 0.1125, 0.1661 0.4260, 0.1194, -0.9749, -0.8561, 0.9346, 0.2230 0.1362, -0.5934, -0.4953, 0.4877, -0.6091, 0.3810 0.6937, -0.5203, -0.0125, 0.2399, 0.6580, 0.1460 -0.6864, -0.9628, -0.8600, -0.0273, 0.2127, 0.5387 0.9772, 0.1595, -0.2397, 0.1019, 0.4907, 0.1611 0.3385, -0.4702, -0.8673, -0.2598, 0.2594, 0.2270 -0.8669, -0.4794, 0.6095, -0.6131, 0.2789, 0.4700 0.0493, 0.8496, -0.4734, -0.8681, 0.4701, 0.3516 0.8639, -0.9721, -0.5313, 0.2336, 0.8980, 0.1412 0.9004, 0.1133, 0.8312, 0.2831, -0.2200, 0.1782 0.0991, 0.8524, 0.8375, -0.2102, 0.9265, 0.2150 -0.6521, -0.7473, -0.7298, 0.0113, -0.9570, 0.7422 0.6190, -0.3105, 0.8802, 0.1640, 0.7577, 0.1056 0.6895, 0.8108, -0.0802, 0.0927, 0.5972, 0.2214 0.1982, -0.9689, 0.1870, -0.1326, 0.6147, 0.1310 -0.3695, 0.7858, 0.1557, -0.6320, 0.5759, 0.3773 -0.1596, 0.3581, 0.8372, -0.9992, 0.9535, 0.2071
Test data.
-0.2468, 0.9476, 0.2094, 0.6577, 0.1494, 0.4132 0.1737, 0.5000, 0.7166, 0.5102, 0.3961, 0.2611 0.7290, -0.3546, 0.3416, -0.0983, -0.2358, 0.1332 -0.3652, 0.2438, -0.1395, 0.9476, 0.3556, 0.4170 -0.6029, -0.1466, -0.3133, 0.5953, 0.7600, 0.4334 -0.4596, -0.4953, 0.7098, 0.0554, 0.6043, 0.2775 0.1450, 0.4663, 0.0380, 0.5418, 0.1377, 0.2931 -0.8636, -0.2442, -0.8407, 0.9656, -0.6368, 0.7429 0.6237, 0.7499, 0.3768, 0.1390, -0.6781, 0.2185 -0.5499, 0.1850, -0.3755, 0.8326, 0.8193, 0.4399
You must be logged in to post a comment.