## Matrix Multiplication in Parallel with C# and the TPL

Recently I was working with matrices. In general I prefer writing my own custom matrix methods rather than using pre-written code libraries because I like having total understanding and control over my code (which usually happens only when I write the code myself). For years I’ve known that in principle, matrix multiplication can be performed in parallel. But until .NET 4.0, coding a parallelized version of matrix multiplication was tricky and difficult. However, the release of the Task Parallel Library, in the System.Threading.Tasks namespace in .NET 4.0, makes easy parallelization a reality.

Suppose you implement a matrix in C# as an array of arrays. The parallelization I’ll demo shortly will also work if you use true multidimensional arrays. For example:

```static double[][] MatrixCreate(int rows, int cols)
{
// do error checking here
double[][] result = new double[rows][];
for (int i = 0; i < rows; ++i)
result[i] = new double[cols];
return result;
}
```

A traditional approach to matrix multiplication is:

```static double[][] MatrixProduct(double[][] matrixA,
double[][] matrixB)
{
int aRows = matrixA.Length; int aCols = matrixA[0].Length;
int bRows = matrixB.Length; int bCols = matrixB[0].Length;
if (aCols != bRows)
throw new Exception("xxx");

double[][] result = MatrixCreate(aRows, bCols);

for (int i = 0; i < aRows; ++i) // each row of A
for (int j = 0; j < bCols; ++j) // each col of B
for (int k = 0; k < aCols; ++k) // could use k < bRows
result[i][j] += matrixA[i][k] * matrixB[k][j];
return result;
}
```

Calling the code could resemble:

```double[][] ma = MatrixCreate(3,4);
double[][] mb = MatrixCreate(4,2);
// assign values to ma and mb
double[][] prod = MatrixProduct(ma, mb);
```

Matrix multiplication is O(n^3) and therefore can be time-consuming. For example, if some matrix A has size 300 x 400, and matrix B has size 400 x 200, there’d be 300 x 400 x 200 = 24,000,000 type double multiplication operations. A simple way to parallelize matrix multiplication is:

```static double[][] MatrixProduct(double[][] matrixA,
double[][] matrixB)
{
int aRows = matrixA.Length; int aCols = matrixA[0].Length;
int bRows = matrixB.Length; int bCols = matrixB[0].Length;
if (aCols != bRows)
throw new Exception("xxxx");

double[][] result = MatrixCreate(aRows, bCols);

Parallel.For(0, aRows, i =>
{
for (int j = 0; j < bCols; ++j) // each col of B
for (int k = 0; k < aCols; ++k) // could use k < bRows
result[i][j] += matrixA[i][k] * matrixB[k][j];
}
);

return result;
}
```

The Parallel.For statement can be loosely interpreted as, “Divide up all statements that use i, where i ranges from 0 to aRows, and perform them separately on different processors.”

All of the ugly plumbing details are abstracted away, and as a developer all I need to know is some slightly new syntax. I’ve left out a ton of important details but the main idea is simple and powerful.