PyTorch Tanh and a Downside to Open Source Software

I try to write a little bit of code each day. Writing code is a skill that can only be learned by practice, and furthermore, if you don’t practice you will lose your existing skill. I don’t speak any foreign language fluently, but I suspect the same use-it-or-lose-it factor is true there too.

The PyTorch neural network code library is very complex. But the fact that PyTorch is open source makes learning PyTorch even more difficult than it should be. I’ll explain using the tanh function as an example.

The tanh function (hyperbolic tangent) is the most common function used for hidden layer activation for shallow neural networks. The PyTorch library has at least four different ways to use tanh. This is not good and contributes to confusion. Why are there so many ways to call tanh? How are they different? When should each be used?
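For reference, the function itself is tiny, whichever way it gets called. Here is a minimal sketch that computes tanh from its textbook definition and compares it with the library version. The helper name my_tanh and the sample inputs are made up for illustration; they are not part of PyTorch.

```python
import math
import torch as T

def my_tanh(x):
  # textbook definition: (e^x - e^-x) / (e^x + e^-x)
  # (hypothetical helper, for illustration only)
  return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

xs = [-2.0, 0.0, 0.5, 3.0]  # arbitrary sample inputs
by_hand = T.tensor([my_tanh(x) for x in xs])
by_torch = T.tanh(T.tensor(xs))

print(by_hand)
print(by_torch)
```

The two printed tensors should agree to within floating point rounding.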

Here’s the simplest way to call tanh:

import torch as T
import numpy as np

def forward(self, x):
  z = self.hid1(x)
  z = T.tanh(z)  # 1. ordinary function
  z = T.nn.Softmax(0)(self.oupt(z))
  return z
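The forward() method above is only a fragment. A minimal sketch of how it might sit inside a complete network follows. The class name Net, the 4-7-3 layer sizes, and the sample input are assumptions I picked for illustration; only hid1, oupt, and the tanh call come from the fragment.

```python
import torch as T

class Net(T.nn.Module):
  def __init__(self):
    super().__init__()
    self.hid1 = T.nn.Linear(4, 7)  # assumed sizes: 4 inputs, 7 hidden
    self.oupt = T.nn.Linear(7, 3)  # assumed 3 output classes

  def forward(self, x):
    z = T.tanh(self.hid1(x))            # ordinary function call
    z = T.softmax(self.oupt(z), dim=0)  # same effect as T.nn.Softmax(0)
    return z

T.manual_seed(1)
net = Net()
x = T.tensor([5.0, 3.5, 1.3, 0.3])  # one made-up input item
probs = net(x)
print(probs)
```

Because softmax is applied along dim 0 of a single 1-D item, the three output values sum to 1.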

When PyTorch was first released, this was the way to call tanh — simple and effective. But with open source, there’s no penalty to the maintainers of the library when they make arbitrary changes. So they make many unnecessary changes.

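So at some point many examples switched to the functional form (which, in some releases, emitted a deprecation warning pointing back to torch.tanh):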

  z = T.nn.functional.tanh(z) 

But then the preferred technique changed again to a class:

  z = T.nn.Tanh()(z)

On top of this, some of the very early documentation used a low-level approach:

  for j in range(4):
    z[j] = np.tanh(z[j].item())
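As far as I can tell, all four forms shown above compute the same values. A quick sketch to check this on a small tensor follows. The sample values are arbitrary, and depending on the installed PyTorch version the functional call may print a deprecation warning.

```python
import numpy as np
import torch as T

z = T.tensor([-2.0, -0.5, 0.0, 1.5])  # arbitrary sample values

a = T.tanh(z)                # 1. ordinary function
b = T.nn.functional.tanh(z)  # 2. functional form
c = T.nn.Tanh()(z)           # 3. class (module) form

d = z.clone()                # 4. element-by-element via NumPy
for j in range(len(d)):
  d[j] = np.tanh(d[j].item())

same = T.allclose(a, b) and T.allclose(a, c) and T.allclose(a, d)
print(same)
```

The practical difference is mostly about style: the class form can be stored as a layer object in __init__(), while the function forms are called directly inside forward().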

This has led to documentation and examples that differ significantly even on something as fundamental as a simple hyperbolic function.

In code libraries that are maintained and curated by a company or central authority, there are releases only about once or twice a year, and the releases are usually well thought out. Open source software spews releases continuously, which sometimes leads to bad decisions, poorly thought-out design, many revisions, and mountains of outdated documentation.



Left: A boating bad decision. Center: A girl like this trying to ride a motorcycle like that — a bad decision that did not end well a few seconds after the photo was taken. Right: Amazon founder Jeff Bezos has set an epic standard for Bad Decision.
