The conjugate gradient algorithm is an iterative technique for solving a system of linear equations whose matrix is symmetric and positive-definite. Just for fun, I coded up the example problem given in the Wikipedia article on the topic.

Suppose

```
4x0 +  x1 = 1
 x0 + 3x1 = 2
```

The solution is (x0, x1) = (0.0909, 0.6364), which is exactly (1/11, 7/11). Conjugate gradient is an iterative algorithm and in theory solves a system of n linear equations in n steps. In practice, because of round-off error, conjugate gradient will get very close to the exact answer in n steps but may not solve the problem exactly. In many situations, a solution that's accurate to 7 or 10 decimals is close enough.

Anyway, I coded up the Wikipedia example problem using raw C#. The image on this post shows the mapping between my demo program and the Wikipedia example.
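My demo is in C#, but the algorithm is short enough to sketch in a few lines of any language. Here is a minimal Python version of the same Wikipedia example, using the standard CG variable names (A, b, x, r, p); it is a sketch for illustration, not a transcription of my C# demo.

```python
# Conjugate gradient on the 2x2 example:
#   4x0 +  x1 = 1
#    x0 + 3x1 = 2

def mat_vec(A, v):
    # matrix-vector product A @ v
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(A, b, x, max_iters=25, tol=1e-10):
    r = [bi - avi for bi, avi in zip(b, mat_vec(A, x))]  # residual b - Ax
    p = r[:]                                             # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iters):
        Ap = mat_vec(A, p)
        alpha = rs_old / dot(p, Ap)                      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new < tol:                                 # residual small enough
            break
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]     # new conjugate direction
        rs_old = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b, [2.0, 1.0])  # start from the guess (2, 1), as in the article
print(x)  # close to (0.0909, 0.6364), i.e. (1/11, 7/11)
```

Since n = 2 here, the loop converges in just two iterations (up to round-off), which is the "n steps" property in action.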

All in all, it was a very fun little problem. I’m mostly a software developer who has a moderate background in research. I can understand machine learning theory (as in the Wikipedia article) when I read it, but only up to a point. To fully understand an algorithm, I have to code it for myself.

Another point of this blog post is that for some topics, Wikipedia is very good and has become the de facto basic reference. Conjugate gradient falls into this category. But there are many machine learning topics where the Wikipedia entry is really, really bad, so you have to be cautious.