Cross Entropy Error and Logarithmic Scoring Rule

I was working on a “computational economics” project recently. That project used the logarithmic scoring rule. I noticed that the logarithmic scoring rule is essentially the negative of cross entropy error, which is often used as the error measure when training neural networks.

The math definition of cross entropy error is:

$$C = -\sum_{i} p_i \log(q_i)$$

where p is a set of actual probabilities and q is a set of predicted probabilities. In the case of a neural network classifier, suppose a set of predicted probabilities is q = (0.2, 0.7, 0.1) and the set of actual probabilities is p = (0, 1, 0). This corresponds to a classification problem with three possible outcomes, such as “democrat”, “republican”, and “independent”, where the actual outcome for a training item is the middle value, “republican”.

The cross entropy error for this example, using the natural log, is:

```
C = - (0 * log(0.2) +
       1 * log(0.7) +
       0 * log(0.1))

  = -log(0.7)
  = 0.3567
```
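The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the function name `cross_entropy` is my own, it assumes the natural log, and it skips terms where the actual probability is 0 so that a predicted probability of exactly 0 for an outcome that did not occur does not cause an error.

```python
import math

def cross_entropy(p, q):
    # C = -sum(p_i * ln(q_i)); terms with p_i == 0 contribute nothing,
    # so they are skipped (this also avoids evaluating log(0)).
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = (0, 1, 0)          # actual: "republican"
q = (0.2, 0.7, 0.1)    # predicted probabilities

print(round(cross_entropy(p, q), 4))  # 0.3567
```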

Notice that for a classification problem, all the actual probabilities will be 0 except for one probability, which will have value 1. So all the terms in the equation will drop out except one. Also, because each predicted probability is at most 1, its log is negative or zero, so the leading negative sign means cross entropy error will always be positive (or zero).

The math definition of the logarithmic scoring rule is:

$$L = \log(q_i)$$

where q_i is the predicted probability of the event i that actually occurred. So, for the set of p values and q values above,

```
L = log(0.7)
  = -0.3567
```

So, the logarithmic scoring rule is the negative of the cross entropy error in situations where exactly one outcome occurs.
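To see the relationship numerically, here is a small Python sketch that computes both quantities for the example above. The function names `log_score` and `cross_entropy` are mine, and the code assumes the natural log and a one-hot set of actual probabilities.

```python
import math

def cross_entropy(p, q):
    # C = -sum(p_i * ln(q_i)), skipping terms where p_i is 0
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def log_score(q, actual_index):
    # L = ln(q_i), where i is the index of the outcome that occurred
    return math.log(q[actual_index])

p = (0, 1, 0)
q = (0.2, 0.7, 0.1)

score = log_score(q, p.index(1))
print(round(score, 4))                             # -0.3567
print(math.isclose(score, -cross_entropy(p, q)))   # True
```

Because a higher logarithmic score means a better prediction while a lower cross entropy error means a better prediction, the two measures rank predictions the same way.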