Computing PyTorch Model Accuracy is Not Trivial

I’ve been using the PyTorch neural code library since it was first released, just over three years ago. Recently, I’ve been refactoring a lot of my demo programs to update them to new PyTorch features and best practices.

During model training, it’s not too difficult to compute model error. But it’s surprisingly tricky to compute model classification accuracy. Classification accuracy is just the percentage of correct predictions.

There are many different approaches for computing PyTorch model accuracy but all the techniques fall into one of two categories: analyze the model one data item at a time, or analyze the model using one batch of all the data at once.

The one-item-at-a-time approach is more flexible and allows you to investigate exactly which data items were incorrectly predicted. The all-items-at-once approach has far fewer lines of code but isn’t as flexible.

In pseudo-code, the one-item-at-a-time approach is:

loop each data item
  X = input # like [2.5, 1.5, 3.0, 4.5]
  Y = target class # like 2
  oupt = model(X)  # computed like [0.3, 0.1, 0.6]
  pc = argmax(oupt)  # predicted class
  # print X, Y, oupt, pc to see what happened
  if pc == Y
    num_correct += 1
  else
    num_wrong += 1
end-loop
return num_correct / (num_correct + num_wrong)

Translating this simple pseudo-code to working PyTorch code is difficult. Here’s an example I use when working with the well-known Iris Dataset:

def accuracy(model, dataset):
  model.eval()
  dataldr = T.utils.data.DataLoader(dataset,
    batch_size=1, shuffle=False)

  n_correct = 0; n_wrong = 0
  for (_, batch) in enumerate(dataldr):
    X = batch['predictors']
    Y = T.flatten(batch['species'])
    oupt = model(X)  # logits form
    (big_val, big_idx) = T.max(oupt, dim=1) 
    # print here if necessary
    if big_idx.item() == Y.item():
      n_correct += 1
    else:
      n_wrong += 1

  acc = (n_correct * 100.0) / (n_correct + n_wrong)
  return acc

Because using Dataset and DataLoader objects is now the standard way to process training and test data, I use a Dataset object as the input parameter. I set the mode to eval() — this is a complex topic but briefly you use train() mode when training and use eval() mode at all other times.

The DataLoader is created with batch_size=1 so that it iterates one data item at a time. An alternative is to iterate through the input parameter Dataset directly, without using a DataLoader, as shown below.
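Here is a minimal sketch of that direct-iteration alternative. It assumes, as in the accuracy() function above, that each dataset[i] is a Dictionary holding tensor values under the 'predictors' and 'species' keys:

def accuracy_no_ldr(model, dataset):
  # iterate the Dataset directly, one item at a time, no DataLoader
  model.eval()
  n_correct = 0; n_wrong = 0
  with T.no_grad():
    for i in range(len(dataset)):
      X = dataset[i]['predictors'].reshape(1, -1)  # 1-item batch
      Y = dataset[i]['species'].reshape(-1)        # 1-D target tensor
      oupt = model(X)                              # logits
      big_idx = T.argmax(oupt, dim=1)              # predicted class
      if big_idx.item() == Y.item():
        n_correct += 1
      else:
        n_wrong += 1
  return (n_correct * 100.0) / (n_correct + n_wrong)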

Notice that the DataLoader returns a Dictionary object, so you need to know the keys, which are ‘predictors’ and ‘species’. The implication is that you can’t really write a general purpose accuracy function; you need to craft a new accuracy() function for each problem scenario.

The T.max() function is like the NumPy argmax() function except that T.max() returns both the largest value and the index of the largest value. PyTorch recently (not sure exactly when, but within the last few versions) added a T.argmax() function that returns just the index. Notice the dim argument to T.max(). Dealing with Tensor shapes and dimensions is a real nightmare when developing models.
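Here is a tiny example with made-up logit values:

t = T.tensor([[0.3, 0.1, 0.6]])        # shape [1,3], one item per batch row
(big_val, big_idx) = T.max(t, dim=1)   # tensor([0.6]), tensor([2])
big_idx2 = T.argmax(t, dim=1)          # tensor([2]), index only
print(big_idx.item())                  # 2, as an ordinary Python int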

The target class needs to be accessed by Y.item() because Y is a tensor with just one value — another weird quirk of PyTorch that drives beginners crazy.

The batch all-items-at-once version is:

def accuracy_b(model, dataset):
  model.eval()
  X = dataset[0:len(dataset)]['predictors']
  Y = T.flatten(dataset[0:len(dataset)]['species'])

  oupt = model(X)
  # (_, arg_maxs) = T.max(oupt, dim=1)
  arg_maxs = T.argmax(oupt, dim=1)  # argmax() is new
  num_correct = T.sum(Y==arg_maxs)
  acc = (num_correct * 100.0 / len(dataset))
  return acc.item()

The key concept is the statement:

num_correct = T.sum(Y==arg_maxs)

The == comparison compares all target Y values (no item() needed) with all arg_maxs values, and the T.sum() returns the count where target Y equals arg_max. Working with aggregates like this is something that’s quite difficult for many developers, including me, to get used to.
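A tiny example with made-up values shows the idea:

Y = T.tensor([0, 1, 2, 2])              # target classes
arg_maxs = T.tensor([0, 1, 1, 2])       # predicted classes
eq = (Y == arg_maxs)                    # tensor([True, True, False, True])
num_correct = T.sum(eq)                 # tensor(3)
acc = (num_correct * 100.0 / 4).item()  # 75.0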


Not so accurate school signs.


Should You Normalize and Encode Data Before Train-Test Splitting, or After Splitting?

In theory, it’s better to split neural network data into training and test datasets and then normalize and encode each dataset separately. In practice, there are advantages to normalizing and encoding all the data first, and then splitting the data. I usually normalize and encode first and then split.

Suppose you have 100 source data items where each data item represents a person:

33  male    68,000.00  sales  moderate
27  female  52,000.00  admin  liberal
41  male    77,000.00  tech   conservative
. . .

Your goal is to create a neural network to predict the political leaning (conservative, moderate, liberal) of a person based on age, sex, income, and job type. At some point in time you need to encode the categorical predictors (sex and job type), and you should normalize the numeric predictors (age and income).

Additionally, you probably want to split the 100-item source data into an 80-item set for training the neural network and a 20-item set for testing and model evaluation.

The guiding theoretical principle is that you should split the source data into training and test sets before you do anything else, then pretend the test data doesn’t exist. You use the test data only as the very last step, and then the model prediction accuracy on the test data is a rough estimate of how well the model will do on new, previously unseen data.

So, according to theory, it’s a no-brainer: split the data first and then normalize and encode the training data only (remember, the test data doesn’t exist conceptually). Next, train the model. Then encode and normalize the test data so that it’s compatible with the trained model, using the same normalizing and encoding parameters computed from the training data (such as min and max). Finally, use the test data to evaluate the trained model.
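A minimal sketch of this split-first idea, using min-max normalization on a single numeric column with made-up values (the split point and the column are just for illustration):

import numpy as np

ages = np.array([33, 27, 41, 25, 39, 52], dtype=np.float32)  # made-up ages
train, test = ages[:5], ages[5:]   # split first

mn, mx = train.min(), train.max()  # normalization parameters from train data only
train_norm = (train - mn) / (mx - mn)
test_norm = (test - mn) / (mx - mn)  # same train parameters applied to test data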

But there are advantages to normalizing and encoding first, and then splitting.

If you normalize and encode all the source data first, and then split the data, both the training and test data have additional information compared to the split-first approach, because the normalization and encoding parameters (such as min and max values, and the full set of categorical values) are computed from all of the source data rather than from just the training items.

If you think about it carefully, you’ll realize that the theoretically-endorsed approach of splitting first will (probably) give you a slightly better final estimate of model accuracy. But if you normalize and encode first and then split, you will (probably) get a slightly better prediction model, because the training data contains ever so slightly more information.

In practical terms, normalizing and encoding the source data first and then splitting is quite a bit easier than splitting first and then normalizing and encoding two separate datasets. And if you normalize and encode first, you won’t run into a situation where you can’t encode a categorical test predictor because it didn’t appear in the training data. For example, suppose that after the split the training data has only three job types: sales, admin, tech. They would be encoded as (1 0 0), (0 1 0), and (0 0 1), and the job-type predictor leads to three input nodes in the neural network. Now suppose the test data, by sheer bad luck of the split, has a job type, such as exec, that wasn’t in the training data. How can you encode exec? You can’t. Note: The counterargument is that you should always ensure that test data is representative of all the data, so this scenario should never be allowed to happen.

The bottom line is that whether you should normalize and encode predictors before splitting into train-test or after splitting into train-test isn’t clear-cut. In most situations the tiny theoretical advantage you get by splitting first and then normalizing and encoding isn’t worth the extra effort required. The final estimate of model accuracy is very fuzzy no matter how you split and normalize-encode.

My usual approach is to normalize and encode all source data first. I normalize using the divide-by-constant approach and I encode using the one-hot technique. Then I split the normalized and encoded data into a training set and a test set. Then I train a model using the training data. And then I evaluate the model using the test data.
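Here is a minimal sketch of that normalize-encode-first-then-split workflow. The divide-by-constant divisors, the one-hot orderings, and the 80-20 split are illustrative assumptions, not fixed rules:

import numpy as np

raw = [  # made-up rows: age, sex, income, job, politics
  [33, "male",   68000.00, "sales", "moderate"],
  [27, "female", 52000.00, "admin", "liberal"],
  [41, "male",   77000.00, "tech",  "conservative"],
]

sex_oh = { "male": [1, 0], "female": [0, 1] }
job_oh = { "sales": [1, 0, 0], "admin": [0, 1, 0], "tech": [0, 0, 1] }
pol_idx = { "conservative": 0, "moderate": 1, "liberal": 2 }

encoded = []
for (age, sex, income, job, pol) in raw:
  row = [age / 100.0] + sex_oh[sex] + [income / 100000.0] \
        + job_oh[job] + [pol_idx[pol]]
  encoded.append(row)
encoded = np.array(encoded, dtype=np.float32)

np.random.shuffle(encoded)   # shuffle, then split 80-20
n_train = int(0.80 * len(encoded))
train_data = encoded[0:n_train]
test_data = encoded[n_train:]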


Left: The 1970 Chevrolet Camaro had a split front bumper. Center: The 1963 Chevrolet Corvette had a split rear window. Right: The 1958 General Motors Firebird III concept car had a split windshield.


Limiting the Size of a PyTorch Dataset / DataLoader

When developing a deep neural model, you normally start by working with a relatively small subset of your data, which saves a huge amount of time. The most common way to read and use training and test data when using PyTorch is to use a Dataset object and a DataLoader object. Unfortunately, neither object has a built-in way to adjust the size of the underlying data.

One approach is to use some sort of utility to create subset files. For example, the MNIST images dataset has 60,000 training and 10,000 test images. You could use a utility program to make a 1,000-item set for training and a 100-item set for testing to get your model up and running, then a 5,000-item and a 500-item set for tuning parameters, and then finally use the full 60,000-item and 10,000-item datasets when you’re fully ready to train.
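A hypothetical utility like that is only a few lines of Python. This sketch assumes the data has already been converted to a one-item-per-line text format, and the file names are made up:

def make_subset(src_file, dest_file, n):
  # copy the first n lines of a text data file to a smaller subset file
  with open(src_file, "r") as fin, open(dest_file, "w") as fout:
    for (i, line) in enumerate(fin):
      if i >= n: break
      fout.write(line)

# example: make_subset("mnist_train_all.txt", "mnist_train_1000.txt", 1000)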



Top: Limiting the PyTorch Dataset / DataLoader using the limit-at-load technique. Bottom: The early-exit while training technique.


If you are using a PyTorch Dataset / DataLoader and you want to programmatically adjust the size of your underlying data, there are two realistic options. First, if your Dataset object is program-defined, as opposed to black box code written by someone else, you can limit the amount of data read into the Dataset data storage. Second, you can read all the data and then track the number of lines of data processed during training, and early-exit when you reach some limit.

I coded up two demo programs on a simple 9-item dataset to illustrate both techniques. In both cases I restrict the data to just 6 of the 9 items. The key lines in the limit-at-load technique are:

class IrisDataset(T.utils.data.Dataset):
  def __init__(self, src_file, root_dir=None,
    num_rows=None, transform=None):
    self.data = np.loadtxt(src_file, usecols=range(0,5),
      max_rows=num_rows, delimiter=",",
      skiprows=0, dtype=np.float32)
    . . . etc.

  iris_ds = IrisDataset(".\\Data\\iris_subset_mod.txt",
    num_rows=6)  # read 6

  train_ldr = T.utils.data.DataLoader(iris_ds, batch_size=2,
    shuffle=False, drop_last=False)  # load 6

  for epoch in range(0,2):  # 2 epochs
    print("epoch = " + str(epoch))
    for (batch_idx, batch) in enumerate(train_ldr): 
      print("  bat idx = " + str(batch_idx))
      . . . etc.

The key lines of the early-exit technique are:

  iris_ds = IrisDataset(".\\Data\\iris_subset_mod.txt")  # read all

  train_ldr = T.utils.data.DataLoader(iris_ds, batch_size=2,
    shuffle=False, drop_last=False)  # load all

  for epoch in range(0,2):  # 2 epochs
    print("epoch = " + str(epoch))
    num_lines_read = 0
    for (batch_idx, batch) in enumerate(train_ldr): 
      if num_lines_read == 6: break  # early exit
      num_lines_read += 2  # batch size
      . . . etc.

There are some fundamental differences between the two techniques. The limit-at-load technique fetches only the first num_rows of data and uses just those rows. The early-exit technique fetches all rows but uses only a subset of them on each epoch, and that subset will (likely) be different on each epoch if the shuffle=True argument is set in the DataLoader object.
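For example, a small variation of the early-exit demo above (assuming iris_ds is defined as before), with shuffle=True, means each epoch will (probably) train on a different 6 rows:

  train_ldr = T.utils.data.DataLoader(iris_ds, batch_size=2,
    shuffle=True, drop_last=False)   # reshuffled at the start of every epoch

  for epoch in range(0,2):
    num_lines_read = 0
    for (batch_idx, batch) in enumerate(train_ldr):
      if num_lines_read == 6: break  # early exit; a (likely) different 6 rows each epoch
      num_lines_read += 2            # batch size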

So, there’s no big moral to this technical story except that perhaps working with PyTorch, or any other deep neural library, requires a lot, repeat a lot, of patience and attention to detail.


Shrinking the size of a PyTorch Dataset is easier than shrinking people. Left: “The Incredible Shrinking Man” (1957). Center: “Dr. Cyclops” (1940). Right: “Attack of the Puppet People” (1958).


Installing PyTorch 1.5 for CPU on Windows 10 with Anaconda 2020.02 for Python 3.7

PyTorch is a deep neural code library that you can access using the Python programming language. Anaconda is a collection of software packages that contains a base Python engine plus over 500 compatible Python packages.

Prerequisites: A machine with a relatively modern CPU (no older than 8 years old). You must be logged on as a user with full administrative privileges and be connected to the Internet.


1. Open the Windows Control Panel and uninstall all existing Python instances. You may have several different versions of Python installed. Note: In theory you can use PyTorch with multiple versions of Python on your machine, but you need expert skill to prevent software collisions and incompatibilities.


2. At this point you have no Python and no Python packages on your machine.


3. Do an Internet search for “anaconda archive” to find old versions of Anaconda. The URL when I wrote this post was https://repo.anaconda.com/archive/ but it could change. Locate the installer file for Windows, 64-bit, Anaconda version 2020.02 — be careful, it’s very easy to get the wrong file. It’s Anaconda3-2020.02-Windows-x86_64.exe. Double-click on it to execute directly, or right-click and download it to your local machine to any convenient directory. Then, after downloading, double-click on the executable installer to start installation.


4. You will see a Welcome screen. Double-check you have the correct version. Click “Next”.


5. You will see a License Agreement. Click “I Agree”.


6. You will see Installation Type. Accept the default “Just Me”.


7. You will see Installation Location. Accept the default. This will vary depending on whether you are logged on as a network user or a local user. The location is usually C:\Users\(user)\AppData\Local\Continuum\anaconda3 or C:\Users\(user)\Anaconda3. You should write this location down because everything goes here.


8. On the next screen you should check the box labelled “Add Anaconda to my PATH environment variable” in spite of the red warning message. Then click the “Install” button.


9. The installer will unzip thousands of files and then install them. The process takes 5-20 minutes. If you want, you can click on the “Details” button or you can just watch the green progress bar.


10. Eventually you will see an Installation Complete screen. Click “Next”.


11. You will see a screen that presents some marketing information. Click the “Next” button.


12. You will see a screen that has two check-boxes for more information. Uncheck both options and click the “Finish” button.


13. Python should now be installed. To test, open a command shell. Navigate to your root directory by typing “cd \” and hitting (Enter). Next type “python” (without the quotes). You should see the Python version 3.7.6 and the triple-greater-than (>>>) interactive prompt. Type “exit()” and hit (Enter).


14. Now you must find the .whl installation file for PyTorch 1.5 CPU on Python 3.7 for Windows. There are two main places where you might find it: pypi.org and pytorch.org.

Do an Internet search for “pytorch 1.5 cpu windows”. You may need to search a bit. I eventually found the .whl file at:

https://download.pytorch.org/whl/cpu/torch_stable.html

When you find the page, look for file:

torch-1.5.0%2Bcpu-cp37-cp37m-win_amd64.whl

Right-click and download the .whl file to your local machine. I suggest creating a C:\PyTorch\Wheels directory and saving there.


15. Launch a command shell and navigate to the directory where you saved the .whl file. Enter the command:

pip install "torch-1.5.0%2Bcpu-cp37-cp37m-win_amd64.whl"

Installation of PyTorch is relatively quick. The shell will display a message indicating successful installation. To verify PyTorch, enter the following commands (note: there are two consecutive underscores in the version command).

python
import torch as T
T.__version__


If you succeeded, congratulations! You can only learn PyTorch by running and experimenting with programs; now you’re ready.



Note: To uninstall just PyTorch but nothing else, launch a command shell and enter “pip uninstall torch”. You’ll get asked for confirmation.

To uninstall everything, go to the Windows Control panel | Programs and Features | Uninstall and then uninstall Python-Anaconda. Make sure you know what you’re doing.

Note: Many PyTorch examples on the Internet work with image data, such as the MNIST hand-written digits dataset. To work with images you’ll need the “torchvision” code library add-on. You’ll likely find a .whl file for torchvision version 0.6 on the same Web page as the PyTorch .whl file. It’s named:

torchvision-0.6.0%2Bcpu-cp37-cp37m-win_amd64.whl

You can right-click on the .whl file and download it to your local machine, and then install with the command:

pip install "torchvision-0.6.0%2Bcpu-cp37-cp37m-win_amd64.whl"



My Ten Favorite Science Fiction Films of the 1950s

I grew up watching 1950s science fiction movies. To be honest some of them haven’t held up too well over time but many of them are quite good and I have a soft spot in my heart for all of these films. Here is a list of my top 10 science fiction films of the 1950s, meaning, if I was going on a trip for three months and could only take ten sci-fi films from the 50s, these would be the ten.

I first published this list in 2012. I revisited it eight years later in 2020 and I still pretty much agree with my original thoughts.


1. Invaders from Mars (1953) – A young boy thinks he sees a flying saucer land during a storm at night. Soon, people, including his parents, start acting strangely. This movie still scares me. Some of the best music of any science fiction movie ever. Do not waste your time on the horrible 1986 remake.


2. Forbidden Planet (1956) – Fantastic special effects, innovative music, and Robby the Robot highlight a story where Leslie Nielsen captains the C-57D to find out what happened to the colony on Altair IV.


3. Gog (1954) – Richard Egan stars as an investigator sent to a super-secret underground desert laboratory complex to solve a series of bizarre deaths. I love the robot Gog – what scientific research robot is complete without crushing claws and a flamethrower?


4. Quatermass 2 (1957) – A British film sometimes called “Enemy from Space” in the U.S. A somewhat crusty Brian Donlevy plays Dr. Quatermass (not Quartermass) as he investigates reports of strange meteorites. He ends up at a creepy industrial plant. Is this an alien invasion or just paranoia?


5. The War of the Worlds (1953) – A George Pal production with Oscar-winning special effects. Gene Barry desperately tries to find a way to stop an unstoppable Martian invasion. I love Sir Cedric Hardwicke’s introductory narration from the H.G. Wells book. I did not like the 2005 Spielberg remake.


6. Godzilla (1956) – Although later movies featuring Godzilla became cartoonish, the original 1954 Japanese version and the 1956 American-ized version are deadly serious. Raymond Burr watches the destruction of Tokyo from an ill-advised location on top of a tall antenna tower. The early scene on the island, when the scientists are hiking up the steep hill and Godzilla appears, gave me nightmares for years.


7. The Thing from Another World (1951) – Usually just called The Thing, this movie has the classic scenario of a group of people isolated (in this case at a polar research station) and menaced by an alien. Excellent acting and intelligent dialog set this movie apart. I prefer this version to the good 1982 remake.


8. Them! (1954) – The predecessor of all giant bug films has policeman James Whitmore and professor Edmund Gwenn discovering unexpected consequences (giant man-eating ants) of atomic testing in the desert. I like the suspense and the fact that the ants aren’t seen until well into the movie.


9. 20,000 Leagues Under the Sea (1954) – The Disney film is really more adventure than science fiction. I was fascinated by the Nautilus submarine and as a young man loved the 1960s exhibit featuring it and sets from the movie in a display on Main Street of the Anaheim Disneyland (where I ended up working many years later while going to school at UC Irvine).


10. It! The Terror from Beyond Space (1958) – Very tense film in which a crew lands on Mars to investigate the disappearance of all but one member of a previous expedition. On the return to earth they discover that they have a very unfriendly stowaway. This movie was a direct inspiration for the 1979 film “Alien”.



Honorable Mention – There are many films that didn’t quite make it into my top 10 list. The Atomic Submarine (1959) – Great scene in the alien saucer in total darkness, and innovative electronic music effects. The Trollenberg Terror (1958) – Known as The Crawling Eye in the U.S., Forrest Tucker is menaced by, well, giant crawling eyeballs in creepy fog. Attack of the Crab Monsters (1957) – Cheap but effective Roger Corman production has people trapped on an island. Fiend without a Face (1958) – Canadian production with very cool crawling brain creatures. When Worlds Collide (1951) – Earth must be evacuated before it’s too late. The Beast from 20,000 Fathoms (1953) – Nice Ray Harryhausen stop-action effects when the beast meets its end on a roller coaster. It Came from Beneath the Sea (1955) – More Ray Harryhausen effects featuring a giant octopus in San Francisco. This Island Earth (1955) – Earth scientists try to help the planet Metaluna against the Zagons. Earth vs. the Flying Saucers (1956) – The title says it all. Kronos (1957) – Earth is menaced by an enormous energy-collecting machine. The Monolith Monsters (1957) – The monsters are huge crystalline structures. The Man from Planet X (1951) – Some very scary scenes when people approach the spacecraft. Rodan (1956) – I like the early scenes in the mine before the appearance of the two flying dinosaurs.


Correlation and Causation – Cities, Race, Crime

Correlation can indicate possible causation but correlation doesn’t prove causation. A common example that often appears in media is the relationship between the percentage of Black residents in a city and the violent crime rate. There’s a very strong statistical correlation between the race and crime variables.

The graph below plots data for 14 large U.S. cities. The Pearson R-squared is 0.82 — very strong. But this doesn’t necessarily mean that the minority-ness of a city causes violent crime. It’s just correlation. The only thing you can say with some confidence is that the violent crime rate in cities with large percentages of minority residents is much higher than in cities with low percentages. Other potential causes of high violent crime rates include percentage of children born out of wedlock, absence of fathers in the family unit, embedded culture, low education level, and so on. Statistics can suggest causes but only controlled experiments can prove causation.


The x-axis is the percentage of Black residents in a city. The y-axis is the violent crime rate per 100,000 residents. The raw data comes from FBI crime statistics.
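For reference, a Pearson r and R-squared value can be computed with NumPy. The x and y values below are made-up placeholders, not the actual city data:

import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # made-up predictor values
y = np.array([ 6.0, 11.0, 19.0, 31.0, 34.0])  # made-up outcome values

r = np.corrcoef(x, y)[0, 1]  # Pearson correlation coefficient
r_squared = r * r
print(r_squared)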


In most cases, how statistics are used is up to human interpreters. One of the differences between classical statistics and machine learning is that machine learning is usually more predictive than classical statistics. For example, suppose you want to place a bet on one of two sports teams. Classical statistics might look at R-squared correlations, graphs, and all kinds of tables, and then you could use the data, combined with human intuition, to pick one of the two teams to bet on.

A machine learning approach might consist of a deep neural network that ultimately outputs a team to bet on, perhaps with an estimated probability that the selected team will win.

Of course, the difference between classical statistics and machine learning isn’t clear cut. There’s a lot of overlap between the two, and it’s really a matter of perspective.


Three clever photos using forced perspective.


A Minimal PyTorch Complete Example

I have taught quite a few workshops on the PyTorch neural network library. Learning PyTorch (or any other neural code library) is very difficult and time consuming. If beginners start without knowledge of some fundamental concepts, they’ll be overwhelmed quickly. But if beginners spend too much time on fundamental concepts before ever seeing a working neural network, they’ll get bored and frustrated. Put another way, even an experienced developer shouldn’t start with a PyTorch LSTM network, and on the other hand, he shouldn’t start with four weeks of learning about low-level details of Tensor objects.

To deal with this learning difficulty issue I created what I consider to be a minimal, reasonable, complete PyTorch example. I targeted the recently released version 1.5 of PyTorch, which I expect to be the first significantly stable version (meaning very few bugs and no version 1.6 for at least six months).

The idea is to learn in a spiral fashion, getting an example up and running, and then gradually expanding the features and concepts. My minimal example hard-codes the training data, doesn’t use any test data, uses online rather than batch processing, doesn’t explicitly initialize the weights and biases, doesn’t monitor error during training, doesn’t evaluate model accuracy after training, and doesn’t save the trained model. Even so, my minimal example is nearly 100 lines of code.

Some of my colleagues might use the PyTorch Sequential() class rather than the Module() class to define a minimal neural network, but in my opinion Sequential() is far too limited to be of any use, even for simple neural networks.
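For comparison, here is roughly what the 4-7-3 network in the demo below would look like using Sequential(). This is just a sketch, not the approach the demo uses:

# sketch: a 4-7-3 network defined with Sequential() instead of Module()
net = T.nn.Sequential(
  T.nn.Linear(4, 7),
  T.nn.Tanh(),
  T.nn.Linear(7, 3)).to(device)  # no softmax; CrossEntropyLoss() handles it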

The training data is just 6 items from the famous Iris Dataset. Each item consists of four predictor values (sepal length and width, petal length and width) and a species to predict (0 = setosa, 1 = versicolor, 2 = virginica).

Even though I’ve coded hundreds of neural networks in many different ways, I underestimated how much information is contained in even a minimal neural network. Almost every line of code requires significant explanation — up to a certain point. When I use the minimal example in a workshop, I could easily devote over 8 hours of discussion to it. But that would defeat the purpose of a minimal example.

Weirdly, I think the complexity of neural networks with PyTorch is an appealing factor in some way. It creates an intellectual challenge that appeals to a competitive personality.

Somewhat unfortunately, there’s a lot of work that has to be done in order to set up a PyTorch environment to run a minimal example. Briefly, you have to install a Python distribution (I strongly prefer and recommend Anaconda), and then install PyTorch (and usually TorchVision if you work with image data). It doesn’t seem like much, but there’s a lot that can go wrong.


Interestingly, research has shown that men are significantly more competitive than women on average, and that men and women compete differently. See the Harvard Business Review article at hbr.org/2019/11/research-how-men-and-women-view-competition-differently. Left: Physical competition — the current U.S. record holder in the javelin throw, Breaux Greer (since 2007). Center: Intellectual competition — the current world chess champion, Magnus Carlsen (since 2013). Right: Personal appearance is a form of competition (since approximately 3000 BC).


# iris_minimal.py
# PyTorch 1.5.0-CPU Anaconda3-2020.02  Python 3.7.6
# Windows 10 

import numpy as np
import torch as T
device = T.device("cpu")  # apply to Tensor or Module

# -----------------------------------------------------------

class Net(T.nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(4, 7)  # 4-7-3
    self.oupt = T.nn.Linear(7, 3)
    # (initialize weights)

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = self.oupt(z)  # no softmax. see CrossEntropyLoss() 
    return z

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nBegin minimal PyTorch Iris demo ")
  T.manual_seed(1)
  np.random.seed(1)
  
  # 1. set up training data
  print("\nLoading Iris train data ")

  train_x = np.array([
    [5.0, 3.5, 1.3, 0.3],
    [4.5, 2.3, 1.3, 0.3],
    [5.5, 2.6, 4.4, 1.2],
    [6.1, 3.0, 4.6, 1.4],
    [6.7, 3.1, 5.6, 2.4],
    [6.9, 3.1, 5.1, 2.3]], dtype=np.float32) 

  train_y = np.array([0, 0, 1, 1, 2, 2], dtype=np.int64)

  print("\nTraining predictors:")
  print(train_x)
  print("\nTraining class labels: ")
  print(train_y)

  train_x = T.tensor(train_x, dtype=T.float32).to(device)
  train_y = T.tensor(train_y, dtype=T.long).to(device)

  # 2. create network
  net = Net().to(device)    # could use Sequential()

  # 3. train model
  max_epochs = 100
  lrn_rate = 0.04
  loss_func = T.nn.CrossEntropyLoss()  # applies softmax()
  optimizer = T.optim.SGD(net.parameters(), lr=lrn_rate)

  print("\nStarting training ")
  net.train()
  indices = np.arange(6)
  for epoch in range(0, max_epochs):
    np.random.shuffle(indices)
    for i in indices:
      X = train_x[i].reshape(1,4)  # device inherited
      Y = train_y[i].reshape(1,)
      optimizer.zero_grad()
      oupt = net(X)
      loss_obj = loss_func(oupt, Y)
      loss_obj.backward()
      optimizer.step()
    # (monitor error)
  print("Done training ")

  # 4. (evaluate model accuracy)

  # 5. use model to make a prediction
  net.eval()
  print("\nPredicting species for [5.8, 2.8, 4.5, 1.3]: ")
  unk = np.array([[5.8, 2.8, 4.5, 1.3]], dtype=np.float32)
  unk = T.tensor(unk, dtype=T.float32).to(device) 
  logits = net(unk).to(device)
  probs = T.softmax(logits, dim=1)
  probs = probs.detach().numpy()  # allows printoptions

  np.set_printoptions(precision=4)
  print(probs)

  # 6. (save model)

  print("\nEnd Iris demo")

if __name__ == "__main__":
  main()