A standard neural network classifier builds a model that predicts output values from input values. For example, the famous Iris Data has 150 items. Each item has four predictor variables (sepal length, sepal width, petal length, petal width) followed by one of three species to predict: setosa encoded as (1,0,0), versicolor encoded as (0,1,0), and virginica encoded as (0,0,1). The first item in the set is:
5.1, 3.5, 1.4, 0.2, 1, 0, 0
You train the neural classifier to find the defining weight constants so that given an input set of four values, the model correctly predicts the species.
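To make the encoding concrete, here is a minimal sketch in plain Python of how one line of the encoded Iris file could be split into its four predictor values and its one-of-three species target. The function names `parse_item` and `decode_species` are made up for illustration, and the sketch assumes the comma-delimited layout shown above.

```python
# Species order matches the encoding in the text:
# (1,0,0) = setosa, (0,1,0) = versicolor, (0,0,1) = virginica.
SPECIES = ["setosa", "versicolor", "virginica"]

def parse_item(line):
    """Split one comma-delimited line into (inputs, target)."""
    values = [float(v) for v in line.split(",")]
    inputs = values[:4]   # sepal length, sepal width, petal length, petal width
    target = values[4:]   # encoded species
    return inputs, target

def decode_species(target):
    """Map an encoded (or predicted) target vector back to a species name."""
    return SPECIES[target.index(max(target))]

inputs, target = parse_item("5.1, 3.5, 1.4, 0.2, 1, 0, 0")
print(inputs, target, decode_species(target))
```

Picking the position of the largest value in `decode_species` also works for a classifier's raw outputs, which are rarely exactly 0 or 1.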
A replicator neural network builds a model that predicts its own inputs. This sounds strange at first, but I’ll explain the point shortly. For the Iris Data, you’d take the data for one of the three species (say, setosa) and remove the encoded species labels. The idea is to feed the replicator NN the four inputs and have the model spit back the same four values. For example, conceptually, the first line of a training data file would be:
5.1, 3.5, 1.4, 0.2, 5.1, 3.5, 1.4, 0.2
The first four values act as the inputs and the next four values act as the targets. Of course, although you could explicitly duplicate the values in the data file, there’s no need to do so: the targets are the same as the inputs, so you can duplicate them programmatically.
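Duplicating the values programmatically might look something like the sketch below. The helper name `make_replicator_pairs` is hypothetical, and the code assumes each row holds the four predictor values followed by the three encoded label values, as in the original Iris file. The two rows are the first two setosa items.

```python
def make_replicator_pairs(rows):
    """Drop the encoded labels and use the four predictors as both inputs and targets."""
    pairs = []
    for row in rows:
        inputs = list(row[:4])    # keep only the four predictor values
        targets = list(inputs)    # targets are a copy of the inputs
        pairs.append((inputs, targets))
    return pairs

setosa_rows = [
    [5.1, 3.5, 1.4, 0.2, 1, 0, 0],
    [4.9, 3.0, 1.4, 0.2, 1, 0, 0],
]

pairs = make_replicator_pairs(setosa_rows)
for inputs, targets in pairs:
    print(inputs, "->", targets)
```

Copying the list rather than reusing the same object means later in-place changes to the inputs (normalization, for example) won't silently change the targets too.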
So, what’s the point? A replicator neural network can be used for anomaly detection. For example, if the data is some sort of network packet data, then you have tons of “normal” data. You train a replicator NN on that normal data. Now when new data comes in, you pass it to the replicator. If the replicator NN doesn’t reproduce the packet data closely enough (defining what that means is the hard part), then the incoming packet might be malicious.
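One common way to define "closely enough" is to compute the mean squared error between an item and the replicator's output, and flag the item when the error exceeds a threshold. The sketch below shows just that scoring step using the Iris values rather than packet data. Everything in it is an assumption for illustration: `stand_in_replicator` is a placeholder that always returns made-up values roughly typical of setosa (it is not a trained network), and the threshold of 0.5 is arbitrary.

```python
def reconstruction_error(model, item):
    """Mean squared error between an item and the model's reproduction of it."""
    output = model(item)
    return sum((o - x) ** 2 for o, x in zip(output, item)) / len(item)

def is_anomaly(model, item, threshold):
    """Flag the item if the model fails to reproduce it closely enough."""
    return reconstruction_error(model, item) > threshold

def stand_in_replicator(item):
    # Placeholder only: a real replicator NN would compute its output from
    # trained weights. These fixed values are made up for the demo.
    return [5.0, 3.4, 1.5, 0.2]

THRESHOLD = 0.5  # arbitrary; in practice this would be tuned on normal data

normal_item = [5.1, 3.5, 1.4, 0.2]   # first setosa item
odd_item = [6.3, 3.3, 6.0, 2.5]      # a virginica item, "abnormal" relative to setosa

for item in (normal_item, odd_item):
    err = reconstruction_error(stand_in_replicator, item)
    print(item, round(err, 4), is_anomaly(stand_in_replicator, item, THRESHOLD))
```

Because the scoring functions accept any callable, a trained replicator could be swapped in for the placeholder without changing the rest of the code.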
I coded up a short demo using raw Python. Good fun!
The moral of the story is that getting and using labeled training data, meaning data that has known correct output values, is time-consuming and difficult. Training on such data is called supervised training. Replicator NNs are an example of a machine learning technique that doesn’t need labeled data (unsupervised training).