Training a Generative Adversarial Network (GAN)

A generative adversarial network (GAN) is a complex deep neural system that generates synthetic data modeled on a set of real data. This is useful in several scenarios, such as generating additional training data for a neural classifier when you have limited data for one of the classes you’re trying to predict. GANs are most often used for image data, but they can be used with any kind of data.

I understand GANs quite well now. But as with any kind of software system, I didn’t understand GANs until I implemented one in code and dissected every line of code. Here’s a screenshot of a GAN I created that generates fake ‘3’ UCI Digits (crude 8×8 grayscale images of a handwritten ‘3’):

While I was learning how to create and use GANs, I searched the Internet looking for information. I found many diagrams. However, in retrospect, it is clear that almost all of the information I found about GANs was posted by developers who really didn’t understand GANs at all. Post after post was basically a highly simplistic diagram that grotesquely understated how complex GANs are. Here’s an example of one of the diagrams I found on the Internet:

Now don’t get me wrong — the simple diagram is very good in the sense that it gives someone a general idea of how GANs work. But the diagram is so over-simplified it’s misleading for anyone who wants to actually implement a GAN, especially the training code. I figured I could create a more detailed diagram of how to train a GAN, so I fired up PowerPoint and created a diagram that shows one training iteration:

The first observation is that training a GAN is complicated. The second, indirect observation is that, because of that complexity, there are a huge number of hyperparameters and design decisions, which makes GANs very difficult to work with. But like most things in computer science, once you understand GANs (after many hours of exploration), they’re really quite easy and fascinating.
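To make the idea of one training iteration concrete, here is a minimal PyTorch sketch. It is not my demo code, just an illustration under assumptions: the layer sizes, the latent dimension of 20, the Adam optimizers with a 0.001 learning rate, and the random stand-in for a batch of real images are all hypothetical choices. Each of them is exactly the kind of hyperparameter or design decision mentioned above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: 64 pixels for an 8x8 image, 20-value latent noise vector.
IMG_DIM, LATENT_DIM, BATCH = 64, 20, 16

# Generator: noise vector -> fake image with pixel values in [0, 1].
gen = nn.Sequential(
    nn.Linear(LATENT_DIM, 40), nn.ReLU(),
    nn.Linear(40, IMG_DIM), nn.Sigmoid())

# Discriminator: image -> probability that the image is real.
dis = nn.Sequential(
    nn.Linear(IMG_DIM, 30), nn.ReLU(),
    nn.Linear(30, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
gen_opt = torch.optim.Adam(gen.parameters(), lr=0.001)  # assumed settings
dis_opt = torch.optim.Adam(dis.parameters(), lr=0.001)  # assumed settings

def train_step(real_batch):
    """Run one training iteration: update the discriminator, then the generator."""
    n = real_batch.shape[0]
    ones = torch.ones(n, 1)    # target label "real"
    zeros = torch.zeros(n, 1)  # target label "fake"

    # 1. Discriminator update: real images should score 1, fakes should score 0.
    dis_opt.zero_grad()
    noise = torch.randn(n, LATENT_DIM)
    fake_batch = gen(noise).detach()  # detach: no generator gradients in this step
    dis_loss = loss_fn(dis(real_batch), ones) + loss_fn(dis(fake_batch), zeros)
    dis_loss.backward()
    dis_opt.step()

    # 2. Generator update: try to make the discriminator score fresh fakes as real.
    #    Gradients also reach the discriminator here, but only gen_opt steps, and
    #    dis_opt.zero_grad() clears them at the start of the next iteration.
    gen_opt.zero_grad()
    noise = torch.randn(n, LATENT_DIM)
    gen_loss = loss_fn(dis(gen(noise)), ones)
    gen_loss.backward()
    gen_opt.step()

    return dis_loss.item(), gen_loss.item()

# Stand-in for a batch of real images, already normalized to [0, 1].
real_batch = torch.rand(BATCH, IMG_DIM)
d_loss, g_loss = train_step(real_batch)
print(f"discriminator loss = {d_loss:.4f}, generator loss = {g_loss:.4f}")
```

In a real run, train_step would be called in a loop over batches of actual images. Even in this stripped-down form there are many decisions to make: network sizes, activation functions, the two learning rates, whether to update the discriminator more often than the generator, and so on.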

Diagrams for computer science topics can have different levels of detail. Diagrams without much detail are good for understanding general ideas, and diagrams with a lot of detail are better for understanding implementation.

When I get a chance, I’ll tidy up my demo code, write up an explanation, and post it here on my blog or in Visual Studio Magazine.

Street scenes of Victorian England by two of my favorite artists. The paintings have different levels of detail but that’s what art is all about. Left: A very detailed “Foregate Street, Chester” by Louise Rayner (1832-1924). Chester is a walled city in England. Right: A more abstract “London Street at Night” by John Atkinson Grimshaw (1836-1893). Grimshaw was one of the first artists to master city night scenes after the invention of electric lights made such scenes possible.

This entry was posted in PyTorch.
