What is a Self-Organizing Map?

A few days ago, one of my colleagues asked what a self-organizing map (SOM) is. SOMs are a bit tricky to explain because there are many variations. The simplest possible type of SOM is illustrated by the diagram below.

Suppose you have 100 data items, indexed from 0 to 99. Each data item has four numeric values, for example (4.0, 9.0, 0.0, 6.0). If you want to make a self-organizing map for this data, you first create a map shape. In the example, the map is 3×3 and is said to have nine nodes.

Next you use a training algorithm that repeatedly iterates through the 100 data items and computes nine weight vectors, one for each node, as shown in red in the diagram. After you finish computing the node weight vectors, each of the 100 data items is assigned to the node whose weight vector is closest to the data item.
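The post doesn't give the training code, but the classic SOM update can be sketched as follows: repeatedly pick a data item, find the node whose weight vector is closest (the "best matching unit"), and nudge that node and its map neighbors toward the item, with a learning rate and neighborhood radius that shrink over time. The data values, number of steps, and the 0.5 / 2.0 hyperparameters below are all made-up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = 10.0 * rng.random((100, 4))              # 100 items, 4 values each (made-up data)
rows, cols, dim = 3, 3, 4
weights = 10.0 * rng.random((rows, cols, dim))  # one weight vector per map node

steps = 5000
max_lr, max_range = 0.5, 2.0                    # assumed hyperparameters
for s in range(steps):
    pct_left = 1.0 - s / steps
    lr = pct_left * max_lr                      # learning rate decays toward 0
    radius = int(pct_left * max_range)          # neighborhood radius shrinks over time
    item = data[rng.integers(100)]
    # best matching unit: the node whose weight vector is closest to the item
    dists = np.linalg.norm(weights - item, axis=2)
    bi, bj = np.unravel_index(np.argmin(dists), (rows, cols))
    for i in range(rows):
        for j in range(cols):
            if abs(i - bi) + abs(j - bj) <= radius:   # node is in the map neighborhood
                weights[i, j] += lr * (item - weights[i, j])
```

Notice the update moves a node's weight vector partway toward the current item, so after training, nearby map nodes end up with similar weight vectors.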

In the diagram, the first data item, [0] = (1.0, 1.0, 3.0, 2.0), is assigned to the map node in the upper left corner, which has weight vector (1.0, 2.0, 2.0, 3.0). Notice that many data items may be assigned to the same map node, and, although it's not shown in the diagram, it's possible (but unlikely) for a map node to have no associated data items.
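The post doesn't say which distance measure is used; assuming ordinary Euclidean distance, the assignment of item [0] to the upper-left node works out like this:

```python
import numpy as np

item = np.array([1.0, 1.0, 3.0, 2.0])   # data item [0]
w = np.array([1.0, 2.0, 2.0, 3.0])      # upper-left node's weight vector
dist = np.linalg.norm(item - w)         # sqrt(0 + 1 + 1 + 1) = sqrt(3), about 1.73

def closest_node(item, weights):
    # weights has shape (rows, cols, dim); return the map coordinates
    # of the node whose weight vector is closest to the item
    d = np.linalg.norm(weights - item, axis=2)
    return np.unravel_index(np.argmin(d), d.shape)
```

The item is assigned to whichever of the nine nodes gives the smallest such distance.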

OK, so what’s the point? SOMs can be used for three purposes: clustering, dimensionality reduction, and classification.

For the simplest type of SOM shown here, a SOM performs data clustering where the distance relationships between data items are somewhat preserved. So, you can sort of think of this kind of SOM as a granular clustering technique.

Suppose the source data has a label attached, for example:

[0]   1.0  1.0  3.0  2.0  A
[1]   8.0  7.0  7.0  6.0  C
[2]   5.0  5.0  4.0  5.0  B
. . .
[99]  9.0  8.0  6.0  7.0  C

In this case you can map the labels to colors and then display the colors in one of several ways. What you’ve done here is use the SOM for dimensionality reduction: the original dataset has 4 dimensions but you’ve reduced it to 2 dimensions so the data can be graphed.
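One way to sketch this kind of display, assuming trained node weights are available: assign each labeled item to its closest node, then show each node's most common label on the 2-D grid (a label could then be mapped to a color for graphing). The data, labels, and weights below are invented for illustration.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)
data = 10.0 * rng.random((100, 4))            # made-up 4-D data
labels = rng.choice(list("ABC"), size=100)    # made-up class labels
weights = 10.0 * rng.random((3, 3, 4))        # pretend these are trained node weights

# collect the labels of the items assigned to each map node
node_labels = [[[] for _ in range(3)] for _ in range(3)]
for item, lab in zip(data, labels):
    d = np.linalg.norm(weights - item, axis=2)
    i, j = np.unravel_index(np.argmin(d), (3, 3))
    node_labels[i][j].append(lab)

# most common label per node; '.' marks a node with no assigned items
grid = [[Counter(c).most_common(1)[0][0] if c else "."
         for c in row] for row in node_labels]
for row in grid:
    print(" ".join(row))
```

The printed 3×3 grid of letters is the 2-D view of the original 4-D data.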

Finally, a third way to use a SOM is for classification. Suppose you get a new, previously unseen data item = (4.0, 9.0, 1.0, 1.0). You compute the map node the new item is closest to, then look up which class (A, B, or C) is associated with that map node, and that's the predicted class.
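That prediction step can be sketched in a few lines. The node weight values and per-node classes below are hypothetical (only the upper-left weight vector comes from the diagram); in practice they'd come from training and from the label mapping described above.

```python
import numpy as np

# Hypothetical trained 3x3 node weights (values invented for illustration;
# only the upper-left vector matches the diagram)
weights = np.array([
    [[1.0, 2.0, 2.0, 3.0], [3.0, 3.0, 3.0, 3.0], [5.0, 4.0, 4.0, 5.0]],
    [[4.0, 5.0, 4.0, 4.0], [5.0, 5.0, 5.0, 5.0], [6.0, 6.0, 5.0, 6.0]],
    [[6.0, 6.0, 6.0, 6.0], [7.0, 7.0, 7.0, 7.0], [9.0, 8.0, 7.0, 7.0]],
])
# hypothetical dominant class for each map node
node_class = [["A", "A", "B"],
              ["A", "B", "B"],
              ["B", "C", "C"]]

new_item = np.array([4.0, 9.0, 1.0, 1.0])
d = np.linalg.norm(weights - new_item, axis=2)
i, j = np.unravel_index(np.argmin(d), (3, 3))
predicted = node_class[i][j]   # 'A' for these made-up values
```

Notice there's no separate model to train for classification: the node weights and node classes computed earlier do all the work.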

I’ve obviously left out a ton of details, but briefly, this post has described what a SOM is. SOMs aren’t used very often. I suspect one reason is that SOMs are so multipurpose that they’re a bit difficult to grasp conceptually.

I have always loved maps of every kind. Here’s a map of Neverland, with locations shown in the 1953 Disney animated film “Peter Pan”.
