Estimating a Polynomial Function using a Neural Network

One of my colleagues (Kirk O.) recently posed this challenge to me: create a neural network that can calculate the area of a triangle. I mentally scoffed. How hard could it be?

Well, several hours later, I had a working NN system that can calculate the area of a triangle, but it was more difficult than I expected. The area of a triangle is one-half the base times the height. The “one-half” is just a constant, so in reality the challenge is to compute f(x,y) = x * y or, more generally, to compute a polynomial.
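To make the target function concrete, here is a minimal sketch of how training and test data for this problem might be generated. It is not the code from the demo: the helper names (`triangle_area`, `make_dataset`), the uniform sampling, and the seeds are all assumptions made for illustration. The item counts and the limit of 10 on base and height match the ones used in this post.

```python
import random


def triangle_area(base, height):
    """Exact area of a triangle: one-half the base times the height."""
    return 0.5 * base * height


def make_dataset(n_items, max_side=10.0, seed=0):
    """Return a list of ((base, height), area) pairs.

    Assumption: base and height are drawn uniformly from [0, max_side).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_items):
        base = rng.random() * max_side
        height = rng.random() * max_side
        data.append(((base, height), triangle_area(base, height)))
    return data


# Hypothetical data sets: 1000 training items and 100 test items.
train_data = make_dataset(1000, seed=1)
test_data = make_dataset(100, seed=2)
```

Each item pairs the two network inputs with the exact area the network is supposed to approximate.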

I dove in and wrote some code, but I really had to experiment with the hyperparameters, especially the number of hidden nodes to use, the learning rate, and the momentum factor. In the end, I used a 2-5-1 network: two input nodes for the base and the height, five hidden processing nodes, and a single output node for the area.

I used tanh for the hidden layer activation function. I didn’t need an output layer activation function because this is a regression problem. I had to use a very small learning rate (0.0001) and a momentum factor of 0.0 to get good results.
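The architecture described above can be sketched in a few dozen lines of plain Python. This is an illustrative reconstruction, not the demo code: the class name `Net251`, the weight initialization range, the squared-error loss, and the online (one item at a time) update are assumptions. What it does take from the post is the 2-5-1 shape, tanh on the hidden layer, no activation on the output node, a learning rate of 0.0001, and no momentum.

```python
import math
import random


class Net251:
    """A 2-5-1 regression network: tanh hidden layer, identity output."""

    def __init__(self, n_hidden=5, seed=0):
        rng = random.Random(seed)
        # Small random initial weights (the range is an assumption).
        self.w_ih = [[rng.uniform(-0.1, 0.1) for _ in range(2)]
                     for _ in range(n_hidden)]
        self.b_h = [0.0] * n_hidden
        self.w_ho = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b_o = 0.0

    def forward(self, x):
        """Return (hidden activations, output) for x = (base, height)."""
        hidden = [math.tanh(w[0] * x[0] + w[1] * x[1] + b)
                  for w, b in zip(self.w_ih, self.b_h)]
        # No output activation: the output is a plain weighted sum.
        output = sum(w * h for w, h in zip(self.w_ho, hidden)) + self.b_o
        return hidden, output

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.0001):
        """One online gradient-descent update on squared error, no momentum."""
        hidden, output = self.forward(x)
        err = output - target  # derivative of 0.5 * (output - target)^2
        for j, h in enumerate(hidden):
            # Back-propagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
            grad_h = err * self.w_ho[j] * (1.0 - h * h)
            self.w_ho[j] -= lr * err * h
            self.w_ih[j][0] -= lr * grad_h * x[0]
            self.w_ih[j][1] -= lr * grad_h * x[1]
            self.b_h[j] -= lr * grad_h
        self.b_o -= lr * err


def mean_squared_error(net, data):
    return sum((net.predict(x) - t) ** 2 for x, t in data) / len(data)


if __name__ == "__main__":
    rng = random.Random(1)
    data = []
    for _ in range(200):
        b, h = rng.random() * 10.0, rng.random() * 10.0
        data.append(((b, h), 0.5 * b * h))
    net = Net251()
    print("MSE before:", mean_squared_error(net, data))
    for epoch in range(50):
        rng.shuffle(data)
        for x, t in data:
            net.train_step(x, t)
    print("MSE after: ", mean_squared_error(net, data))
```

Because the output node is a plain weighted sum, the network can produce any real value, which is what a regression target such as an area needs. A short run like this one only shows the error going down; getting close to the accuracy reported in this post would take many more iterations and some tuning.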

There’s always a problem trying to determine model accuracy when dealing with regression. Here I cheated a bit by counting an area that’s within 0.5 of the correct result as a correct prediction. I also fudged by limiting the base and height to be less than 10.
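The "within 0.5 counts as correct" rule can be written as a small function. This is a sketch with assumed names (`accuracy`, `predict`, `tolerance`), and it treats a difference of exactly 0.5 as a miss, which is an arbitrary choice since the rule above does not say how the boundary is handled.

```python
def accuracy(predict, data, tolerance=0.5):
    """Fraction of items whose predicted area is within `tolerance` of the target.

    `predict` is any callable mapping (base, height) to a predicted area;
    `data` is a list of ((base, height), area) pairs.
    """
    hits = sum(1 for x, target in data if abs(predict(x) - target) < tolerance)
    return hits / len(data)


# Hypothetical usage with hand-made items and two stand-in predictors.
items = [((3.0, 4.0), 6.0), ((2.0, 5.0), 5.0), ((1.0, 1.0), 0.5), ((6.0, 2.0), 6.0)]
exact = accuracy(lambda x: 0.5 * x[0] * x[1], items)  # every item is a hit
constant = accuracy(lambda x: 6.0, items)             # only the two 6.0 targets are hits
```

Any trained model can be passed in as `predict`, so the same function scores both the training and the test data.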

The demo NN got 91.70% accuracy on its 1000-item training data set, and then 97.00% accuracy on a 100-item test set. It’s unusual to have better accuracy on the test data than on the training data but with NNs weird things happen.

So, thanks Kirk, for making me toss a couple hours of my life away. No, actually, it was really a fun little problem and I learned some valuable tricks.


7 Responses to Estimating a Polynomial Function using a Neural Network

  1. BoilingCoder says:

    I wonder how you would handle normalization of such data,
    since a solution to this problem should be able to work for any triangle side lengths.

  2. Peter Boos says:

    Try it with 3 nodes: 2 input, 1 output… If it works when one sets the weights manually but doesn’t work otherwise, then the problem is the training algorithm. These WordPress articles would be better with some proof of code.

  3. Hi James,
    I’m working on a similar problem for about 2 days now. I have a bunch of input data that should output a single numeric response. Can you point me to more information on this:
    “I used tanh for the hidden layer activation function. I didn’t need an output layer activation function because this is a regression problem.”
    I suspect that it might be part of my problem.

  4. James,
    Thanks again for all the insights, I’m learning a ton. Have you ever written an article on tips for adjusting/training a NN? I see that a lot of times you say, “Found by trial and error”. I can see how, if results are stuck or oscillating too much, I can adjust momentum and learning rate. I’m pretty happy with these two parameters; however, I see that regardless of inputs I generally get the same result (say, 77 out of 80 items produce the same output). I’m slowly going through and adjusting a variety of values, including:
    * adding more training iterations
    * adding more training data
    * adding to the hidden layer
    Do you have any articles or suggestions?
