A variational autoencoder (VAE) is a deep neural system that can generate synthetic data items. One possible use of a VAE is to generate synthetic minority class items (those with very few instances) for an imbalanced dataset. At least in theory — I’ve never seen it done in practice.
So, I decided to code up a little experiment. I used the PyTorch neural library — my current library of choice — but the same ideas can be implemented using TensorFlow or Keras. I started with the UCI Digits dataset. It has 3,823 training images. Each image is a crude handwritten digit from ‘0’ to ‘9’, represented by 8 by 8 (64) pixels, where each pixel is a grayscale value between 0 and 16. There are about 380 images of each digit. (Annoyingly, the dataset isn’t exactly evenly distributed.) I filtered out the 389 ‘1’ digits into a source data file, trained a VAE on them, and then used the trained VAE to generate five synthetic ‘1’ images. I was satisfied with the results.
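The filtering step is simple text processing. Here’s a minimal sketch of the idea, assuming the comma-delimited UCI Digits file format where each line holds 64 pixel values followed by the class label (the function name and parameters are my own, not from any published code):

```python
# Hypothetical filter: keep only the lines whose class label
# (the last comma-separated value) matches the target digit.
# Assumes the UCI Digits format: 64 pixel values, then label.
def filter_digit(src_file, dest_file, digit=1):
    with open(src_file) as fin, open(dest_file, "w") as fout:
        for line in fin:
            values = line.strip().split(",")
            if int(values[-1]) == digit:
                fout.write(line)
```

Running this over the 3,823-line training file would leave just the ‘1’ items in the destination file.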
It did take a bit of experimentation to tune the architecture of the VAE. My final architecture was 64-32-[4,4]-4-32-64. The interior [4,4] means that each digit is condensed to a distribution mean vector with four values and a distribution standard deviation (in the form of log-variance) vector with four values. VAEs are like riding a bicycle — easy once you know how but rather complicated when you don’t.
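In PyTorch, a 64-32-[4,4]-4-32-64 architecture can be sketched roughly as below. This is a minimal illustration of the shape, not my tuned demo code; the activation choices (tanh hidden layers, sigmoid output) are assumptions:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    # 64-32-[4,4]-4-32-64: 64 input pixels, a 32-node encoder
    # hidden layer, a 4-dim mean and 4-dim log-variance, a 4-dim
    # latent vector, a 32-node decoder hidden layer, 64 outputs.
    def __init__(self):
        super().__init__()
        self.enc_hid = nn.Linear(64, 32)
        self.enc_mean = nn.Linear(32, 4)
        self.enc_logvar = nn.Linear(32, 4)
        self.dec_hid = nn.Linear(4, 32)
        self.dec_out = nn.Linear(32, 64)

    def encode(self, x):
        h = torch.tanh(self.enc_hid(x))
        return self.enc_mean(h), self.enc_logvar(h)

    def reparameterize(self, mean, logvar):
        # Sample z = mean + eps * std, where std = exp(logvar / 2).
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mean + eps * std

    def decode(self, z):
        h = torch.tanh(self.dec_hid(z))
        return torch.sigmoid(self.dec_out(h))

    def forward(self, x):
        mean, logvar = self.encode(x)
        z = self.reparameterize(mean, logvar)
        return self.decode(z), mean, logvar
```

The reparameterization trick is what makes the sampling step differentiable, which is the part that trips most people up the first time through.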
The tuning of the architecture determines the fidelity of the synthetic generated images. If the VAE is too good, it will memorize the training data. If the VAE is weak, the synthetic data will be too dissimilar to the source training data.
For my demo, I generated five synthetic ‘1’ digits. They seemed good — they all looked like real UCI Digits ‘1’ images, but they weren’t simple copies.
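Generation itself is the easy part. A minimal sketch, assuming a trained PyTorch VAE object with the 4-dimensional latent described above and a decode() method, and assuming outputs are scaled back to the 0-to-16 pixel range (the function name is my own):

```python
import torch

# Hypothetical generation helper: sample latent vectors from the
# standard normal prior, decode them, and rescale the decoder's
# [0,1] outputs back to the UCI Digits 0-to-16 pixel range.
def generate(vae, n=5):
    vae.eval()
    with torch.no_grad():
        z = torch.randn(n, 4)      # n samples from the 4-dim prior
        pixels = vae.decode(z) * 16.0
    return pixels.round()
```

Because each z is a fresh random draw rather than the encoding of a training item, the decoded images resemble the training data without being copies of it.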
When I get some free time, I’ll tidy up my demo code, add some explanation, and publish it all in Visual Studio Magazine.
Starting with the mixed media portrait on the left, I did repeated Internet image searches for similar images to manually generate two synthetic versions of the original image. The starting image on the left is by artist Hans Jochem Bakker. The center image is by artist Daniel Arrhakis. The image on the right is by artist Randy Monteith.
I’m not sure how well my synthetic image generation idea worked, but I like all three portraits anyway. Finding beauty is always a good use of time.