## The Difference Between Categorical, Multinomial, Bernoulli, and Gaussian Naive Bayes Classification

The naive Bayes (NB) technique is a machine learning approach for classification. There are four main types of NB, which differ in the type of predictor data they work with. All four variations of NB can work with binary classification (e.g., predict the sex of a person) or with multi-class classification (e.g., predict the State a person lives in).

Briefly, the four types of NB are:

1. Categorical: the predictors are all categorical, like “red” or “blue”.
2. Multinomial: the predictors are all integer counts.
3. Bernoulli: the predictors are all Boolean/binary.
4. Gaussian: the predictors are all numeric (continuous values), and each is assumed to follow a normal distribution within each class.

Here are some examples.

```
1. Categorical NB
Predict a person's job type from State, sex, race

georgia  male    white  management
oregon   female  asian  sales
. . .
```
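
The categorical example above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `train_categorical_nb` and `predict_categorical_nb` are made up for this sketch, add-one (Laplace) smoothing is an assumed design choice, and only the first two data rows come from the example above (the other two are hypothetical, added so each class has more than one row).

```python
from collections import Counter, defaultdict

def train_categorical_nb(rows, labels):
    """Count class frequencies and, per class, how often each value appears in each column."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # value_counts[(column, label)][value] -> count
    vocab = [set() for _ in rows[0]]      # distinct values seen in each column
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            vocab[j].add(value)
    return class_counts, value_counts, vocab

def predict_categorical_nb(model, row):
    """Return a dict of class -> probability for one row of categorical predictors."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_c in class_counts.items():
        score = n_c / total  # prior P(class)
        for j, value in enumerate(row):
            # Add-one (Laplace) smoothing so an unseen value never gives probability zero
            score *= (value_counts[(j, label)][value] + 1) / (n_c + len(vocab[j]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

rows = [
    ("georgia", "male", "white"),    # from the example above
    ("oregon", "female", "asian"),   # from the example above
    ("georgia", "female", "white"),  # hypothetical row
    ("oregon", "male", "asian"),     # hypothetical row
]
labels = ["management", "sales", "management", "sales"]

model = train_categorical_nb(rows, labels)
probs = predict_categorical_nb(model, ("georgia", "male", "white"))
print(probs)
```

Multiplying raw probabilities is fine for three predictors. With many predictors, summing log-probabilities avoids numeric underflow.
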
```
2. Multinomial NB
Predict a college course type from counts of each A-F grade

5  9  16  3  1  mathematics
4  7  11  0  0  psychology
6  6   9  2  1  history
. . .
```
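
Here is a rough sketch of how the multinomial example could be scored, using only the three rows shown above as training data. The function names and the smoothing parameter `alpha` are assumptions made for this illustration. Each class gets a smoothed probability for each grade column, and a count vector is scored by weighting the log of those probabilities by the observed counts.

```python
import math

def train_multinomial_nb(X, y, alpha=1.0):
    """Estimate log priors and smoothed log probabilities of each count column per class."""
    classes = sorted(set(y))
    n_features = len(X[0])
    log_prior, log_theta = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        totals = [sum(r[j] for r in rows) for j in range(n_features)]
        denom = sum(totals) + alpha * n_features
        # theta[j] = (count of column j in class c + alpha) / (all counts in class c + alpha * d)
        log_theta[c] = [math.log((t + alpha) / denom) for t in totals]
    return log_prior, log_theta

def predict_multinomial_nb(model, x):
    """Return the class with the highest log score for a vector of counts."""
    log_prior, log_theta = model
    scores = {
        c: log_prior[c] + sum(xj * lt for xj, lt in zip(x, log_theta[c]))
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Counts of A, B, C, D, F grades, taken from the example above
X = [
    [5, 9, 16, 3, 1],
    [4, 7, 11, 0, 0],
    [6, 6, 9, 2, 1],
]
y = ["mathematics", "psychology", "history"]

model = train_multinomial_nb(X, y)
predicted = predict_multinomial_nb(model, [4, 7, 11, 0, 0])
print(predicted)
```
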
```
3. Bernoulli NB
Predict a person's political party from votes on six motions

yes  no   yes  yes  no   no   democrat
yes  yes  yes  no   yes  yes  republican
no   no   yes  no   no   no   republican
. . .
```

Note: Bernoulli NB is really just a special case of Categorical NB in which every predictor has exactly two possible values.
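
A minimal sketch of the Bernoulli case, using the three voting rows above with yes encoded as 1 and no as 0. The function names and the `alpha` smoothing parameter are assumptions for this illustration. The key detail is that a "no" vote contributes a factor of (1 - p) rather than being ignored.

```python
def train_bernoulli_nb(X, y, alpha=1.0):
    """For each class, store its prior and the smoothed probability that each predictor is 1."""
    model = {}
    n_features = len(X[0])
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        p_yes = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (prior, p_yes)
    return model

def predict_bernoulli_nb(model, x):
    """Return a dict of class -> probability for one row of 0/1 predictors."""
    scores = {}
    for c, (prior, p_yes) in model.items():
        score = prior
        for xj, p in zip(x, p_yes):
            # A 1 contributes p, a 0 contributes (1 - p)
            score *= p if xj == 1 else (1.0 - p)
        scores[c] = score
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}

# Votes on six motions from the example above, yes = 1 and no = 0
X = [
    [1, 0, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
y = ["democrat", "republican", "republican"]

model = train_bernoulli_nb(X, y)
probs = predict_bernoulli_nb(model, [1, 0, 1, 1, 0, 0])
print(probs)
```
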

```
4. Gaussian NB
Predict a student's happiness from age, height, GPA (values are normalized)

0.21  0.72  0.300  high
0.19  0.65  0.325  low
0.20  0.70  0.297  medium
. . .
```
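
The Gaussian case can be sketched as follows. Each class stores a mean and variance per predictor, and a new item is scored with the normal density. This is an illustration under stated assumptions: the function names are made up, the small `var_floor` added to each variance is an assumed safeguard against zero variance, and the second row in each class is hypothetical (the example above shows only one row per class, which is not enough to estimate a variance).

```python
import math
from statistics import fmean, pvariance

def train_gaussian_nb(X, y, var_floor=1e-6):
    """For each class, store its prior plus the mean and variance of every predictor."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        cols = list(zip(*rows))
        means = [fmean(col) for col in cols]
        # var_floor keeps the variance positive when all values in a column are equal
        variances = [pvariance(col) + var_floor for col in cols]
        model[c] = (prior, means, variances)
    return model

def log_gaussian(x, mean, var):
    """Log of the normal density with the given mean and variance, evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c, (prior, means, variances) in model.items():
        scores[c] = math.log(prior) + sum(
            log_gaussian(xj, m, v) for xj, m, v in zip(x, means, variances)
        )
    return max(scores, key=scores.get)

# Columns: age, height, GPA
X = [
    (0.21, 0.72, 0.300),  # from the example above
    (0.23, 0.74, 0.310),  # hypothetical row
    (0.19, 0.65, 0.325),  # from the example above
    (0.17, 0.63, 0.335),  # hypothetical row
    (0.20, 0.70, 0.297),  # from the example above
    (0.20, 0.68, 0.290),  # hypothetical row
]
y = ["high", "high", "low", "low", "medium", "medium"]

model = train_gaussian_nb(X, y)
predicted = predict_gaussian_nb(model, (0.22, 0.73, 0.305))
print(predicted)
```
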

I remember one of the graduate classes I took at USC from Dr. Dennis Hocevar, one of my mentors. It was a statistics class that emphasized the importance of identifying what type of problem you were facing.

The “naive” in naive Bayes classification means unsophisticated: each predictor variable is treated as independent of the others given the class, so interactions between predictors aren’t used. I like movies where young naive characters discover their power. Left: Luke in “Star Wars” (1977). Center: Matilda in “Matilda” (1996). Right: Harry in “Harry Potter and the Sorcerer’s Stone” (2001).
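
For reference, the independence assumption behind the word “naive” can be written compactly. The notation here is introduced only for this illustration: c is a class label and x_1 through x_n are the predictor values of one item.

$$
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{j=1}^{n} P(x_j \mid c)
$$

The four types of NB differ only in how each factor P(x_j | c) is estimated: from category frequencies, from count proportions, from yes/no frequencies, or from a normal density.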
