Explaining the PyTorch EmbeddingBag Layer

I came across a PyTorch documentation example that used an EmbeddingBag layer. I dissected the example to figure out exactly what an EmbeddingBag layer is and how it works. The bottom line is that an EmbeddingBag layer is useful for relatively simple natural language classification tasks, where the input sentences are short and you can use a basic neural network rather than a complex architecture such as an LSTM or a Transformer.

The diagram below shows how a standard Embedding layer works, and how an EmbeddingBag works. A regular Embedding layer creates a vector of values (the number of values is the embed_dim) for each word. When you batch items together for a sequential-type NN such as an LSTM or a Transformer, you must make all items the same length, so you must pad short sentences. This is a real pain.



[Diagram: a standard Embedding layer compared to an EmbeddingBag layer]


In the diagram, the sentence “Men write code” is converted to [(0.312, 0.882), (0.321, 0.385), (0.543, 0.481), (0.203, 0.404)]. The embed_dim = 2, and the word “men” is represented by (0.312, 0.882). The trailing padding token is encoded as (0.203, 0.404). In an LSTM or GRU, the embedded words are fed in one at a time, sequentially. In a Transformer, all the embedded words are fed in at once and the positional information is handled internally (in a very complex way).
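To make the padding idea concrete, here is a minimal sketch of a regular Embedding layer. The tiny vocabulary and the word indices are made up for illustration, and the actual vector values will be random at initialization rather than the values shown in the diagram:

import torch as T

# hypothetical tiny vocabulary: 0 = padding, 1 = "men", 2 = "write", 3 = "code"
embedding = T.nn.Embedding(num_embeddings=4, embedding_dim=2, padding_idx=0)

# "men write code" padded to a fixed length of 4
sentence = T.tensor([[1, 2, 3, 0]], dtype=T.int64)
vectors = embedding(sentence)
print(vectors.shape)  # torch.Size([1, 4, 2]) -- one 2-value vector per token, including the pad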

With an EmbeddingBag, you don’t need padding. You concatenate the sentences together into one input batch and record where each sentence starts in an offsets array. Instead of each word being represented by an embedding vector, with an EmbeddingBag, each sentence is represented by a single embedding vector. This simplification loses the word-order information, which is why you pair an EmbeddingBag with a simple neural network.
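Here is a minimal sketch of an EmbeddingBag used directly, assuming a made-up vocabulary of six word indices and a batch of two sentences with different lengths:

import torch as T

# assumed word indices: sentence 1 = [1, 2, 3], sentence 2 = [4, 5]
emb_bag = T.nn.EmbeddingBag(num_embeddings=6, embedding_dim=2)  # default mode is 'mean'

text = T.tensor([1, 2, 3, 4, 5], dtype=T.int64)  # both sentences concatenated, no padding
offsets = T.tensor([0, 3], dtype=T.int64)        # sentence 1 starts at index 0, sentence 2 at index 3
summary = emb_bag(text, offsets)
print(summary.shape)  # torch.Size([2, 2]) -- one 2-value vector per sentence, not per word

By default the EmbeddingBag averages the word vectors within each sentence; 'sum' and 'max' modes are also available.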

import torch as T

class TextClassificationModel(T.nn.Module):

  def __init__(self, vocab_size, embed_dim, num_class):
    super(TextClassificationModel, self).__init__()
    # EmbeddingBag maps each sentence (not each word) to one embed_dim vector
    self.embedding = T.nn.EmbeddingBag(vocab_size, 
      embed_dim, sparse=True)
    self.fc = T.nn.Linear(embed_dim, num_class)
    self.init_weights()

  def init_weights(self):
    lim = 0.5
    self.embedding.weight.data.uniform_(-lim, lim)
    self.fc.weight.data.uniform_(-lim, lim)
    self.fc.bias.data.zero_()

  def forward(self, text, offsets):
    # text: all word indices in the batch, concatenated into one 1-D tensor
    # offsets: the starting position of each sentence within text
    embedded = self.embedding(text, offsets)
    oupt = self.fc(embedded)
    return oupt
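
To make the input format concrete, here is how the model above might be called on a batch of two sentences. The sizes and word indices are made up for illustration:

# made-up sizes and word indices, just to show the input format
model = TextClassificationModel(vocab_size=6, embed_dim=2, num_class=3)

text = T.tensor([1, 2, 3, 4, 5], dtype=T.int64)  # two sentences, concatenated
offsets = T.tensor([0, 3], dtype=T.int64)        # starting position of each sentence
with T.no_grad():
  logits = model(text, offsets)
print(logits.shape)  # torch.Size([2, 3]) -- one set of raw class scores per sentence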

The EmbeddingBag with a standard neural network approach is much simpler than the Embedding layer with an LSTM or a Transformer. The EmbeddingBag approach is often viable for situations where the input sequences are just one or two sentences.



There are a lot of fascinating economic and psychological factors related to designer bags that are embedded into some women’s (and, I suppose, a few men’s) consciousnesses. Left: Louis Vuitton City Steamer MM bag ($55,000). Center: Chanel Alligator Classic bag ($18,000). Right: Hermes Birkin White Niloticus Himalaya Crocodile bag ($385,000).
