Defending Machine Learning Image Classification Models from Attacks

I wrote an article titled “Defending Machine Learning Image Classification Models from Attacks” in the January 2021 edition of the Pure AI web site. See https://pureai.com/articles/2021/01/05/defending-model-attacks.aspx.

In the article, I describe an interesting research paper “Denoised Smoothing: A Provable Defense for Pretrained Classifiers” that shows a simple but effective way to defend a machine learning image classification model against adversarial attacks. Image classification models can be attacked by cleverly modifying a source image. The modified image appears unchanged to the human eye, but the image classifier will badly misclassify it. The classic example comes from a 2014 paper where researchers modified a photo of a school bus and tricked the image classifier into thinking the photo was an ostrich.

The technique presented in the Denoised Smoothing paper works as indicated in this diagram:

The source image is fed into the system and several copies of it are made. Noise is added to each copy by modifying its pixel values. The noisy copies are then denoised using a special type of neural network autoencoder. Finally, the denoised copies are fed to a standard image classification system, and a consensus classification is reached across the copies.

The noise that is deliberately added to the source image swamps any adversarial perturbation that an attacker has added.
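
Here is a minimal sketch of the copy, add noise, denoise, classify, and vote steps in Python. It is only an illustration under stated assumptions: the noise is assumed to be Gaussian, and `toy_denoise`, `toy_classify`, `smoothed_predict`, `n_copies`, and `sigma` are hypothetical names I made up for the demo. In a real system the denoiser would be a trained autoencoder and the classifier would be a trained image classification network.

```python
import numpy as np

def toy_denoise(img):
    # Hypothetical stand-in for the denoising autoencoder:
    # it only clips pixel values back into the valid [0, 1] range.
    return np.clip(img, 0.0, 1.0)

def toy_classify(img):
    # Hypothetical stand-in for a pretrained image classifier:
    # it labels an image by its average brightness.
    return "bright" if img.mean() > 0.5 else "dark"

def smoothed_predict(image, classify, denoise, n_copies=100, sigma=0.25, seed=0):
    # Returns the consensus (majority-vote) label and the vote counts.
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n_copies):
        # Make a noisy copy by perturbing every pixel (Gaussian noise assumed).
        noisy = image + rng.normal(0.0, sigma, size=image.shape)
        cleaned = denoise(noisy)     # remove most of the added noise
        label = classify(cleaned)    # classify the denoised copy
        votes[label] = votes.get(label, 0) + 1
    consensus = max(votes, key=votes.get)
    return consensus, votes

# Usage: an 8x8 grayscale "image" with pixel values in [0, 1].
image = np.full((8, 8), 0.9)
label, votes = smoothed_predict(image, toy_classify, toy_denoise)
print(label, votes)
```

The key design point is that the classifier never sees the raw input. It only sees copies that have been noised and then cleaned up, so a tiny adversarial change to the input has to survive both steps, in most of the copies, to change the vote.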

This is an interesting and practical technique for defending image classification models from attack.



Three book covers by artist Vincent Di Fate (b. 1945). Di Fate is considered one of the top science fiction artists and he has won many industry awards. I don’t know how I’d classify his style.
