Getting Reproducible Results When Using Keras

When using Keras (or similar neural network libraries such as CNTK and PyTorch), it can be surprisingly difficult to get reproducible results. Many Keras functions have a random component but no default seed value, so each run can produce different results. Here are some tips for getting reproducible results.

1. Make sure you explicitly initialize the weights and biases in every layer. For example:

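import keras as K

print("Creating 4-(8-8)-3 NN classifier \n")
# Seeded Glorot uniform initializer, so the initial weights are repeatable
my_init = K.initializers.glorot_uniform(seed=1)
model = K.models.Sequential()
# Two hidden layers of 8 tanh units each, fed by 4 input values
model.add(K.layers.Dense(units=8, input_dim=4,
  activation='tanh', kernel_initializer=my_init))
model.add(K.layers.Dense(units=8, activation='tanh',
  kernel_initializer=my_init))
# Output layer with 3 softmax units, one per class
model.add(K.layers.Dense(units=3, activation='softmax',
  kernel_initializer=my_init))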

The kernel_initializer parameter sets the initial weights. According to the documentation, biases are initialized to zero by default, so there’s no need to initialize them explicitly, but you could do so to be safe. Note that you can use the Python “with” statement as a shortcut to apply the same initializer to all functions in the with-block.
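To see why a seeded initializer makes the starting weights repeatable, here is a minimal sketch of the Glorot uniform rule in plain Python. It is an illustration only, not the Keras implementation: the function name glorot_uniform_weights is made up for this example, and the values it produces will not match what K.initializers.glorot_uniform(seed=1) generates, because the underlying random number generators differ.

```python
import math
import random


def glorot_uniform_weights(fan_in, fan_out, seed):
    """Draw a fan_in x fan_out weight matrix from U(-limit, limit).

    Hypothetical helper for illustration; not part of Keras.
    """
    rng = random.Random(seed)  # private generator, global state untouched
    # Glorot uniform limit: sqrt(6 / (fan_in + fan_out))
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


# Same seed, same shape: the two matrices are identical.
first = glorot_uniform_weights(4, 8, seed=1)
second = glorot_uniform_weights(4, 8, seed=1)
print(first == second)
```

Without the seed, each call would start from a different generator state, which is exactly the run-to-run variation the tip is meant to remove.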

2. Be sure to set the NumPy global random seed and also consider setting the global Python random seed and the global TensorFlow random seed:

# my_program.py
import keras as K
import numpy as np
np.random.seed(1) # NumPy
import random
random.seed(2) # Python
from tensorflow import set_random_seed # TensorFlow 1.x; in 2.x use tf.random.set_seed
set_random_seed(3) # TensorFlow

def main():
 print("Hello")
 . . . 
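The three seed calls above can be wrapped in one helper so that every script sets them the same way. This is a rough sketch under a few assumptions: the name set_global_seeds is hypothetical, NumPy and TensorFlow are treated as optional and skipped if they are not installed, and the TensorFlow call is chosen by checking which API the installed version exposes.

```python
import random


def set_global_seeds(numpy_seed=1, python_seed=2, tf_seed=3):
    """Seed the Python, NumPy, and TensorFlow global generators.

    Hypothetical helper; returns the names of the libraries it seeded.
    """
    random.seed(python_seed)
    seeded = ["random"]
    try:
        import numpy as np
        np.random.seed(numpy_seed)
        seeded.append("numpy")
    except ImportError:
        pass  # NumPy not installed, nothing to seed
    try:
        import tensorflow as tf
        if hasattr(tf, "random") and hasattr(tf.random, "set_seed"):
            tf.random.set_seed(tf_seed)   # TensorFlow 2.x
        else:
            tf.set_random_seed(tf_seed)   # TensorFlow 1.x
        seeded.append("tensorflow")
    except ImportError:
        pass  # TensorFlow not installed, nothing to seed
    return seeded


# Re-seeding restores the generator state, so the draws repeat.
set_global_seeds()
first = [random.random() for _ in range(3)]
set_global_seeds()
second = [random.random() for _ in range(3)]
print(first == second)
```

Call the helper once at the top of the program, before any model is built, so that no random values are drawn ahead of the seeding.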

3. Consider setting the PYTHONHASHSEED environment variable before running the program:

C:\MyScripts> set PYTHONHASHSEED=1
C:\MyScripts>
C:\MyScripts>python my_program.py
. . .
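The variable has to be set before the interpreter starts because Python reads the hash seed at startup; assigning os.environ["PYTHONHASHSEED"] inside a script that is already running does not change how that process hashes strings. The sketch below shows the effect by launching child interpreters with a fixed value. The helper name string_hash_in_subprocess is made up for this example.

```python
import os
import subprocess
import sys


def string_hash_in_subprocess(text, hash_seed):
    """Return hash(text) as computed by a fresh Python interpreter
    started with PYTHONHASHSEED set to hash_seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env, capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


# Two separate interpreter runs with the same hash seed agree.
run_a = string_hash_in_subprocess("keras", 1)
run_b = string_hash_in_subprocess("keras", 1)
print(run_a == run_b)
```

String hashes affect the iteration order of sets, so code that loops over a set of strings can behave differently from run to run unless the hash seed is fixed.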

If you still aren’t getting reproducible results, tracking down the issue can be very difficult. You have to walk through each function call, pull up the documentation, and see whether the function signature has a seed parameter. Unfortunately, many functions call other functions, and some functions and methods are inherited from a base class, so a missing seed can be buried several levels deep.
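Part of that search can be automated with the standard inspect module. The sketch below lists the public callables in a module or namespace whose signature includes a parameter named seed. The helper name callables_with_seed_param is hypothetical, and the check is only a first pass: a seed that is accepted through **kwargs, or used only inside a function that gets called internally, will not show up.

```python
import inspect
import types


def callables_with_seed_param(namespace, param_name="seed"):
    """Return sorted names of public callables in namespace whose
    signature has a parameter called param_name."""
    found = []
    for name, obj in inspect.getmembers(namespace, callable):
        if name.startswith("_"):
            continue  # skip private and dunder members
        try:
            params = inspect.signature(obj).parameters
        except (TypeError, ValueError):
            continue  # some built-ins expose no signature
        if param_name in params:
            found.append(name)
    return sorted(found)


# Stand-in namespace with made-up functions, used only for the demo.
def noisy_layer(rate, seed=None):
    pass


def plain_layer(units):
    pass


demo = types.SimpleNamespace(noisy_layer=noisy_layer, plain_layer=plain_layer)
print(callables_with_seed_param(demo))
```

In a real program you would pass a module such as K.layers or K.initializers instead of the stand-in namespace, and then check that every listed function you actually use is given an explicit seed.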

That said, the three tips above handle most situations.



Ana de Armas and Ryan Gosling in “Blade Runner 2049”. A replica human and a replicant human.
