Normalizing and Encoding Neural Network Data

When I was first learning about neural networks, one topic that gave me difficulty was understanding how and when to manipulate the training data. There were many individual examples, but each one stood alone, and it took me some time to piece all the information together.

I wrote an article, “How to Standardize Data for Neural Networks,” in the January 2014 issue of Visual Studio Magazine that summarizes the guidelines for normalizing numeric data and encoding categorical data.

For example, suppose you are trying to predict a person’s political party affiliation (democrat, republican, independent, other) from age (a number between 18 and 120), sex (male or female), annual income (a number like 45,000.00), and location (urban, suburban, rural). So, the first line of training data might be:

30    male    $38,000.00    urban     democrat 

Because neural networks work only with numbers, the training data must be converted to give something like:

-1.23  -1.0  -1.34  ( 0.0   1.0)  (0.0  0.0  0.0  1.0)

Here the age of 30 has been normalized to -1.23, the sex of male has been encoded as -1.0, the income of $38,000.00 has been normalized to -1.34, location of urban has been encoded as (0.0, 1.0), and the y-value of democrat has been encoded as (0.0, 0.0, 0.0, 1.0).
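To make the conversion concrete, here is a minimal Python sketch of one way to produce a row like the one above. It makes several assumptions that are not in this post: the numeric columns are normalized as (value - mean) / standard deviation, the sample ages and incomes are made up (so the results will not match -1.23 and -1.34), and every code other than male, urban, and democrat is a guess at how the remaining categories might be encoded. The names gaussian_normalize and encode_row are hypothetical.

```python
import statistics

# Hypothetical sample columns; the post does not give the full data set.
AGES = [30, 36, 52, 42, 25]
INCOMES = [38000.0, 42000.0, 40000.0, 44000.0, 61000.0]

# Only male = -1.0 comes from the post; female = +1.0 is an assumption.
SEX_CODES = {"male": -1.0, "female": 1.0}

# Only urban = (0.0, 1.0) comes from the post; the other two are assumptions.
LOCATION_CODES = {
    "urban": (0.0, 1.0),
    "suburban": (1.0, 0.0),
    "rural": (-1.0, -1.0),
}

# Only democrat = (0, 0, 0, 1) comes from the post; the rest are assumptions.
PARTY_CODES = {
    "democrat": (0.0, 0.0, 0.0, 1.0),
    "republican": (0.0, 0.0, 1.0, 0.0),
    "independent": (0.0, 1.0, 0.0, 0.0),
    "other": (1.0, 0.0, 0.0, 0.0),
}


def gaussian_normalize(value, column):
    """Return (value - mean) / standard deviation for one numeric column."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)  # population standard deviation (assumed)
    return (value - mean) / std


def encode_row(age, sex, income, location, party):
    """Convert one raw training row into numeric x-values and y-values."""
    x_values = [
        gaussian_normalize(age, AGES),
        SEX_CODES[sex],
        gaussian_normalize(income, INCOMES),
        *LOCATION_CODES[location],
    ]
    y_values = list(PARTY_CODES[party])
    return x_values, y_values


x_values, y_values = encode_row(30, "male", 38000.0, "urban", "democrat")
print(x_values, y_values)
```

Because 30 and $38,000.00 are both below the mean of their sample columns, the normalized age and income come out negative, as in the example row. The exact magnitudes depend entirely on the rest of the data.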



One Response to Normalizing and Encoding Neural Network Data

  1. Seth Juarez has done excellent work on this issue of data for neural networks using POCO; please contact him.
    See his project
