I heard an interesting talk recently about Bayesian neural networks (BNNs). A BNN is a bit tricky to explain. Briefly, suppose you have a regular neural network for regression — meaning the output is a single numeric value. For example, you might predict the annual income of a person based on their age, sex, years of education, and so on.
A regular NN spits out just a single value, like 43.00 (for $43,000 per year). But that doesn’t give you any indication of how confident the prediction is. Or, put another way, there’s no interval prediction such as “43.00 plus or minus 4.5”.
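To make the "plus or minus" idea concrete: one common way to get an interval out of a network is to run several stochastic forward passes and summarize the spread of the sampled predictions. The numbers below are made up for illustration, standing in for hypothetical sampled outputs of some model:

```python
import numpy as np

# Hypothetical sampled predictions (in $1000s per year) from repeated
# stochastic forward passes of some model -- made-up numbers.
preds = np.array([41.2, 44.8, 42.5, 47.1, 39.4, 43.9, 45.0, 40.1])

mean = preds.mean()        # point prediction
std = preds.std(ddof=1)    # spread = a rough confidence measure

# An interval prediction of the "43.00 plus or minus ..." kind.
print(f"{mean:.2f} plus or minus {std:.2f}")
```

The point estimate alone is what a regular NN gives you; the spread is the extra information a Bayesian approach is meant to provide.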
Instead of predicting a single numeric output value, a BNN predicts a probability distribution. This is very hard to do computationally. For example, instead of minimizing mean squared error as you’d do with a regular NN, you could minimize the Kullback-Leibler divergence, which is a measure of the difference between two probability distributions (it’s called a divergence rather than a metric because it isn’t symmetric).
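For the common case of two univariate Gaussians, KL divergence has a well-known closed form, which makes it easy to see what the quantity looks like. A minimal sketch (the specific numbers are just examples):

```python
import math

def kl_gauss(mu1, sigma1, mu2, sigma2):
    """Closed-form KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) )."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

# KL is zero when the two distributions are identical ...
print(kl_gauss(0.0, 1.0, 0.0, 1.0))

# ... and asymmetric: KL(p||q) != KL(q||p) in general,
# which is why it's not a true distance metric.
print(kl_gauss(0.0, 1.0, 1.0, 2.0))
print(kl_gauss(1.0, 2.0, 0.0, 1.0))
```

In BNN training this term typically shows up as part of a larger loss (e.g. alongside a data-fit term), not on its own.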
I’m not impressed with Bayesian neural networks at this point. To be sure, I know very little about them, but on the surface BNNs have the feel of a solution in search of a problem. From my years in academia, I’ve seen this phenomenon all too often. But I’ll withhold a strong opinion on Bayesian neural networks for a while, until I learn more.