An Example of Sensitivity Analysis for a PyTorch Model

In sensitivity analysis, you examine the effects of changing input values to a machine learning prediction model. The classic example is a model that predicts the creditworthiness of a loan applicant based on predictors like income, debt, age, and so on. If the model predicts 0 = decline loan, you might want to examine the effect of the debt predictor variable to see at what point the prediction changes to 1 = approve loan. You'd typically want to know the smallest reduction in debt that would generate an approve-loan result.
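
Here is a minimal sketch of that loan scenario in Python. The credit_model object and the (income, debt, age) input layout are hypothetical, just to show the idea of stepping a single input until the prediction flips:

# sketch: find smallest debt reduction that flips the prediction
# credit_model and the (income, debt, age) layout are hypothetical
import numpy as np

x = np.array([[0.6500, 0.7200, 0.3800]])  # (income, debt, age)
while x[0][1] > 0.0:
  if credit_model.predict(x) == 1:  # 1 = approve loan
    print("approve at debt = %0.4f" % x[0][1])
    break
  x[0][1] -= 0.01  # reduce debt a bit and try again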

Before I go any further, let me point out that sensitivity analysis has a serious drawback, which I explain below.

Sensitivity analysis is closely related to, and in fact is pretty much the same as, what-if analysis. For example, “What if the income value is increased by $1,000?” And sensitivity analysis is a form of model interpretability — understanding how a model works.

I implemented a demo using PyTorch. The demo model predicts a person’s political leaning (conservative, moderate, liberal) based on sex (M, F), age, state (Michigan, Nebraska, Oklahoma), and annual income. After training, I set up an input of Male, 30 years old, Oklahoma, $50,000. The predicted political leaning in pseudo-probabilities is [[0.6905 0.3049 0.0047]], which is class 0 (conservative).

Then I varied the normalized age input value from 0.00 to 0.75 (ages 0 through 75) and examined the results:

Age       Pseudo-Probabilities    Predicted Politics
--------------------------------------------------------
0.00  |  [[0.9956 0.0044 0.    ]]  |  0
0.05  |  [[0.9928 0.0072 0.    ]]  |  0
0.10  |  [[0.9868 0.0132 0.    ]]  |  0
0.15  |  [[0.9728 0.0272 0.0001]]  |  0
0.20  |  [[0.9381 0.0616 0.0002]]  |  0
0.25  |  [[0.8552 0.1438 0.001 ]]  |  0
0.30  |  [[0.6905 0.3049 0.0047]]  |  0
0.35  |  [[0.4666 0.5145 0.0189]]  |  1
0.40  |  [[0.2732 0.6661 0.0607]]  |  1
0.45  |  [[0.1517 0.6913 0.157 ]]  |  1
0.50  |  [[0.0831 0.5905 0.3263]]  |  1
0.55  |  [[0.0445 0.4168 0.5387]]  |  2
0.60  |  [[0.0234 0.2527 0.7239]]  |  2
0.65  |  [[0.0126 0.1427 0.8447]]  |  2
0.70  |  [[0.0072 0.0813 0.9115]]  |  2
0.75  |  [[0.0045 0.0491 0.9464]]  |  2

The sensitivity analysis shows that age = 30 is somewhat of a critical value: at ages 35 through 50 the predicted politics switches from conservative to moderate, and at age 55 and older it switches again to liberal. The analysis suggests a roughly monotonic relationship between age and political leaning, but that relationship holds only for Male, Oklahoma, $50,000; it may not hold for other combinations of input values.

Note: If you make a graph of this data with values of age on the x-axis, the result is called an Individual Conditional Expectation (ICE) plot. You are examining the effect of changing age for one specific data item: Male, 30 years old, Oklahoma, $50,000.
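
Here is a minimal sketch of such an ICE plot using matplotlib, with the ages and class 0 (conservative) pseudo-probabilities from the table above pasted into Python lists:

# minimal ICE plot sketch using the values from the table above
import matplotlib.pyplot as plt

ages = [0.00, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35,
        0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75]
p_con = [0.9956, 0.9928, 0.9868, 0.9728, 0.9381, 0.8552,
         0.6905, 0.4666, 0.2732, 0.1517, 0.0831, 0.0445,
         0.0234, 0.0126, 0.0072, 0.0045]  # P(class 0)

plt.plot(ages, p_con, marker='o')
plt.xlabel("age (normalized)")
plt.ylabel("P(conservative)")
plt.title("ICE plot: Male, Oklahoma, $50,000")
plt.show()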

Based on these results, a logical next step would be to examine age values between 0.30 (age 30) and 0.39 (age 39) at a finer granularity. For example:

0.30  |  [[0.6905 0.3049 0.0047]]  |  0
0.31  |  [[0.6479 0.3458 0.0063]]  |  0
0.32  |  [[0.6034 0.3882 0.0084]]  |  0
0.33  |  [[0.5578 0.4312 0.0111]]  |  0
0.34  |  [[0.5119 0.4736 0.0145]]  |  0
0.35  |  [[0.4666 0.5145 0.0189]]  |  1
0.36  |  [[0.4229 0.5529 0.0243]]  |  1
0.37  |  [[0.3812 0.5878 0.0309]]  |  1
0.38  |  [[0.3422 0.6187 0.0391]]  |  1
0.39  |  [[0.3062 0.6449 0.0489]]  |  1

The data suggests that the conservative-to-moderate boundary lies between ages 34 and 35.
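
You could locate the boundary more precisely with a bisection search instead of a fixed-step sweep. A minimal sketch, assuming the trained net object from the demo program below:

# bisection sketch to locate the class 0 / class 1 boundary
# assumes the trained net object from the demo program below
import torch as T

def pred_class(net, age):
  x = T.tensor([[-1, age, 0,0,1, 0.5000]], dtype=T.float32)
  with T.no_grad():
    return T.argmax(net(x)).item()

lo, hi = 0.30, 0.35  # class 0 at lo, class 1 at hi
for _ in range(20):
  mid = (lo + hi) / 2
  if pred_class(net, mid) == 0: lo = mid
  else: hi = mid
print("boundary near age = %0.4f" % hi)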

There aren’t any really good general-purpose sensitivity analysis tools because how you perform an analysis is highly problem-dependent.

A serious drawback of sensitivity analysis on a single input variable is that it doesn’t take interaction effects with the other input variables into account. For example, in the analysis above, the combined effect of age and sex could be completely different from the effect of age by itself, and exploring all combinations of input variables isn’t practical in most cases. This drawback is so serious that I rarely use sensitivity analysis, because the results can be very misleading.
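
One partial mitigation is to sweep two variables jointly and see whether the per-variable effect changes. Here is a minimal sketch that crosses sex with age, again assuming the trained net from the demo below:

# two-variable sweep sketch: age x sex interaction
# assumes the trained net object from the demo program below
import numpy as np
import torch as T

for sex in [-1, +1]:  # -1 = male, +1 = female
  for age in np.arange(0.0, 0.80, 0.05):
    x = T.tensor([[sex, age, 0,0,1, 0.5000]],
      dtype=T.float32)
    with T.no_grad():
      c = T.argmax(net(x)).item()
    print("sex %+d  age %0.2f  class %d" % (sex, age, c))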

A technique called Shapley Value examines the effect of changing all possible combinations of input variables. In other words, Shapley Value analysis is combinatorial sensitivity analysis. See: jamesmccaffrey.wordpress.com/2020/11/09/example-of-the-shapley-value-for-machine-learning-interpretability/.
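
If you want to experiment with Shapley values without implementing them from scratch, the open-source shap Python package is one option. A rough sketch, with the caveats that the exact shap API varies by version and that train_x_numpy (background training inputs as a NumPy matrix) is an assumed name:

# rough sketch of Shapley values via the shap package
# exact API details vary by shap version
import shap
import numpy as np
import torch as T

def f(x):  # numpy inputs in, numpy pseudo-probs out
  xt = T.tensor(x, dtype=T.float32)
  with T.no_grad():
    return T.exp(net(xt)).numpy()

background = train_x_numpy[:50]  # assumed background data
explainer = shap.KernelExplainer(f, background)
sv = explainer.shap_values(np.array([[-1, 0.30, 0,0,1,
  0.5000]], dtype=np.float32))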

Note: A technique that’s closely related to sensitivity analysis is making a Partial Dependence Plot (PDP). PDPs are used most often for regression models where the prediction is a numeric value, such as predicting a person’s income. In a PDP you pick a predictor of interest, say age, then find all the possible age values in the data, perhaps (18, 19, 20, 23, 24, 29, 30, . . 56). For each possible age value you run a simulation, say 10,000 trials, where you pick random (but legal) values for the other predictors, and compute the average prediction value. Like single-variable sensitivity analysis, PDPs don’t take feature interactions into account, so they’re not all that great.
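
Here is a rough sketch of the PDP idea applied to the demo model, averaging the class 0 pseudo-probability over random legal values of the other predictors. The income range is an assumption, and I use 1,000 trials rather than 10,000 to keep the run short:

# rough PDP-style sketch: for each age, average P(class 0)
# over random legal values of the other predictors
# assumes the trained net object from the demo program below
import numpy as np
import torch as T

rnd = np.random.default_rng(1)
for age in np.arange(0.0, 0.80, 0.05):
  total = 0.0; n_trials = 1000
  for _ in range(n_trials):
    sex = rnd.choice([-1.0, 1.0])
    state = [0.0, 0.0, 0.0]; state[rnd.integers(3)] = 1.0
    income = rnd.uniform(0.2, 1.0)  # assumed legal range
    x = T.tensor([[sex, age, *state, income]],
      dtype=T.float32)
    with T.no_grad():
      total += T.exp(net(x))[0][0].item()
  print("age %0.2f  avg P(conservative) = %0.4f" % \
    (age, total / n_trials))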



The word “sensitive,” when applied to people, describes someone who has “a delicate appreciation of others’ feelings.” Here are two science fiction movies where the alien is definitely not sensitive. Left: The Martian mastermind from “Invaders from Mars” (1953), one of my favorite films of the 1950s. Right: The alien from “Alien” (1979), a movie that scared me a lot when I first saw it.


Demo code. The data can be found at jamesmccaffrey.wordpress.com/2022/09/01/multi-class-classification-using-pytorch-1-12-1-on-windows-10-11/.

# people_sensitivity.py
# predict politics type from sex, age, state, income
# PyTorch 1.12.1-CPU Anaconda3-2020.02  Python 3.7.6
# Windows 10/11 

import numpy as np
import torch as T
device = T.device('cpu')  # apply to Tensor or Module

# -----------------------------------------------------------

class PeopleDataset(T.utils.data.Dataset):
  # sex  age    state    income   politics
  # -1   0.27   0  1  0   0.7610   2
  # +1   0.19   0  0  1   0.6550   0
  # sex: -1 = male, +1 = female
  # state: michigan, nebraska, oklahoma
  # politics: conservative, moderate, liberal

  def __init__(self, src_file):
    all_xy = np.loadtxt(src_file, usecols=range(0,7),
      delimiter="\t", comments="#", dtype=np.float32)
    tmp_x = all_xy[:,0:6]   # cols [0,6) = [0,5]
    tmp_y = all_xy[:,6]     # 1-D

    self.x_data = T.tensor(tmp_x, 
      dtype=T.float32).to(device)
    self.y_data = T.tensor(tmp_y,
      dtype=T.int64).to(device)  # 1-D

  def __len__(self):
    return len(self.x_data)

  def __getitem__(self, idx):
    preds = self.x_data[idx]
    trgts = self.y_data[idx] 
    return preds, trgts  # as a Tuple

# -----------------------------------------------------------

class Net(T.nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.hid1 = T.nn.Linear(6, 10)  # 6-(10-10)-3
    self.hid2 = T.nn.Linear(10, 10)
    self.oupt = T.nn.Linear(10, 3)

    T.nn.init.xavier_uniform_(self.hid1.weight)
    T.nn.init.zeros_(self.hid1.bias)
    T.nn.init.xavier_uniform_(self.hid2.weight)
    T.nn.init.zeros_(self.hid2.bias)
    T.nn.init.xavier_uniform_(self.oupt.weight)
    T.nn.init.zeros_(self.oupt.bias)

  def forward(self, x):
    z = T.tanh(self.hid1(x))
    z = T.tanh(self.hid2(z))
    z = T.log_softmax(self.oupt(z), dim=1)  # NLLLoss() 
    return z

# -----------------------------------------------------------

def accuracy(model, ds):
  # assumes model.eval()
  # item-by-item version
  n_correct = 0; n_wrong = 0
  for i in range(len(ds)):
    X = ds[i][0].reshape(1,-1)  # make it a batch
    Y = ds[i][1].reshape(1)  # 0 1 or 2, 1D
    with T.no_grad():
      oupt = model(X)  # logits form

    big_idx = T.argmax(oupt)  # 0 or 1 or 2
    if big_idx == Y:
      n_correct += 1
    else:
      n_wrong += 1

  acc = (n_correct * 1.0) / (n_correct + n_wrong)
  return acc

# -----------------------------------------------------------

def main():
  # 0. get started
  print("\nBegin People predict politics sensitivity ")
  T.manual_seed(1)
  np.random.seed(1)
  
  # 1. create DataLoader objects
  print("\nCreating People Datasets ")

  train_file = ".\\Data\\people_train.txt"
  train_ds = PeopleDataset(train_file)  # 200 rows

  test_file = ".\\Data\\people_test.txt"
  test_ds = PeopleDataset(test_file)    # 40 rows

  bat_size = 10
  train_ldr = T.utils.data.DataLoader(train_ds,
    batch_size=bat_size, shuffle=True)

# -----------------------------------------------------------

  # 2. create network
  print("\nCreating 6-(10-10)-3 neural network ")
  net = Net().to(device)
  net.train()

# -----------------------------------------------------------

  # 3. train model
  max_epochs = 1000
  ep_log_interval = 200
  lrn_rate = 0.01

  loss_func = T.nn.NLLLoss()  # assumes log_softmax()
  optimizer = T.optim.SGD(net.parameters(), lr=lrn_rate)

  print("\nbat_size = %3d " % bat_size)
  print("loss = " + str(loss_func))
  print("optimizer = SGD")
  print("max_epochs = %3d " % max_epochs)
  print("lrn_rate = %0.3f " % lrn_rate)

  print("\nStarting training")
  for epoch in range(0, max_epochs):
    # T.manual_seed(epoch+1)  # checkpoint reproducibility
    epoch_loss = 0  # for one full epoch

    for (batch_idx, batch) in enumerate(train_ldr):
      X = batch[0]  # inputs
      Y = batch[1]  # correct class/label/politics

      optimizer.zero_grad()
      oupt = net(X)
      loss_val = loss_func(oupt, Y)  # a tensor
      epoch_loss += loss_val.item()  # accumulate
      loss_val.backward()
      optimizer.step()

    if epoch % ep_log_interval == 0:
      print("epoch = %5d  |  loss = %10.4f" % \
        (epoch, epoch_loss))

  print("Training done ")

# -----------------------------------------------------------

  # 4. evaluate model accuracy
  print("\nComputing model accuracy")
  net.eval()
  acc_train = accuracy(net, train_ds)  # item-by-item
  print("Accuracy on training data = %0.4f" % acc_train)
  acc_test = accuracy(net, test_ds) 
  print("Accuracy on test data = %0.4f" % acc_test)

# -----------------------------------------------------------

  # 5. make a prediction
  print("\nPredicting politics for M  30  oklahoma  $50,000: ")
  X = np.array([[-1, 0.30,  0,0,1,  0.5000]], dtype=np.float32)
  X = T.tensor(X, dtype=T.float32).to(device) 

  with T.no_grad():
    logits = net(X)  # do not sum to 1.0
  probs = T.exp(logits)  # sum to 1.0
  probs = probs.numpy()  # numpy vector prints better
  pred_class = np.argmax(probs)
  np.set_printoptions(precision=4, suppress=True)
  print(probs, end=""); print("  |  " + str(pred_class))

# -----------------------------------------------------------

  # 6. sensitivity analysis
  print("\nExamining effect of age on politics type \n")
  X = np.array([[-1, 0.30,  0,0,1,  0.5000]],
    dtype=np.float32)
  X = T.tensor(X, dtype=T.float32).to(device) 

  age = 0.0
  while age < 0.80:
    X[0][1] = age
    with T.no_grad():
      probs = T.exp(net(X)).numpy()
    pred_class = np.argmax(probs)
    print("%4.2f  |  " % age, end ="")
    print(probs, end ="")
    print("  |  %d " % pred_class)

    age += 0.05
    
  print("\nEnd People sensitivity demo")

if __name__ == "__main__":
  main()