Deterministic ML Experiments: Seeding More Than Just NumPy
Seeding a single library is not enough for reproducibility. Python's random module, NumPy, and the deep learning framework each keep their own generator state, and GPU backends can pick nondeterministic kernels no matter which seed is set.
Step 1: seed Python, NumPy, and framework runtime together
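import os
import random
import numpy as np
SEED = 20260311
# PYTHONHASHSEED is read at interpreter startup, so setting it here only
# affects child processes. Export it in the shell to cover this process too.
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)     # Python's built-in generator
np.random.seed(SEED)  # NumPy's legacy global generator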
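Because the interpreter reads PYTHONHASHSEED only at startup, one way to see its effect is to launch child interpreters with the variable already set. The following is a minimal sketch using only the standard library; the helper name child_hash and the hashed string are illustrative assumptions, not part of the original recipe.

```python
import os
import subprocess
import sys


def child_hash(hash_seed: str) -> str:
    """Return hash('reproducible') as computed by a fresh interpreter."""
    # Copy the current environment and pin the hash seed for the child only.
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('reproducible'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Two children started with the same hash seed agree on string hashes.
first = child_hash("20260311")
second = child_hash("20260311")
print(first == second)
```

In practice this usually means exporting the variable in the launch script or job definition rather than inside the training code.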
Step 2: configure deterministic backend options
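import torch
torch.manual_seed(SEED)  # seeds the CPU generator and all CUDA devices
# Raises a RuntimeError when an op has no deterministic implementation.
# On CUDA 10.2+, also set CUBLAS_WORKSPACE_CONFIG=:4096:8 before launch.
torch.use_deterministic_algorithms(True)
torch.backends.cudnn.benchmark = False  # autotuning may pick different kernels per run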
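Steps 1 and 2 are often bundled into a single helper that is called at the top of every entry point. The sketch below is one possible shape for such a helper; the name seed_everything is hypothetical, and the optional imports are an assumption made so that it also runs in environments where NumPy or PyTorch is not installed.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed every RNG that is importable here (hypothetical helper)."""
    # Needs to be in place before cuBLAS is initialised; harmless on CPU.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; skip backend configuration
    torch.manual_seed(seed)
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False


# Re-seeding restarts the sequence, so both draws are identical.
seed_everything(20260311)
draw_a = random.random()
seed_everything(20260311)
draw_b = random.random()
print(draw_a == draw_b)
```

Keeping the calls in one function makes it harder for a new script to seed NumPy but forget the framework.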
Step 3: log seed and environment fingerprint with metrics
import platform
meta = {
    "seed": SEED,
    "python": platform.python_version(),
    "numpy": np.__version__,
    "torch": torch.__version__,
}
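A fingerprint is only useful if it is stored next to the metrics it describes. As a rough sketch, the record could be serialised as one JSON line per run; the function names, the package list, and the placeholder metric value below are assumptions for illustration, not measured results.

```python
import json
import platform
from importlib import metadata


def package_version(name: str) -> str:
    """Look up an installed package version without importing the package."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_record(seed, metrics, packages=("numpy", "torch")):
    """Combine seed, environment fingerprint, and metrics (hypothetical helper)."""
    return {
        "seed": seed,
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {name: package_version(name) for name in packages},
        "metrics": metrics,
    }


# The metric value is a placeholder, not a real result.
record = build_record(20260311, {"val_accuracy": 0.0})
line = json.dumps(record, sort_keys=True)
print(line)
```

Sorting the keys keeps the output stable, which makes two run records easy to diff.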
Pitfall
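Comparing runs made with different library versions and assuming that an identical seed guarantees identical results. A seed fixes the random sequence only for a given implementation; generator algorithms and kernel selection can change between releases and across hardware.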
Verification
- Two runs in the same environment produce matching metrics within a stated tolerance.
- Experiment metadata includes the seed and package versions.
- Nondeterministic kernels are disabled where required.
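The first two checks can be automated. The sketch below assumes each run is stored as a dict with "seed", "python", "torch", and a "metrics" mapping; that layout, the function name compare_runs, and the sample values are all illustrative assumptions. It refuses to compare runs whose environment fields differ, then tests each metric with a relative tolerance.

```python
import math


def compare_runs(run_a, run_b, rel_tol=1e-6):
    """Return {metric: within_tolerance} for two comparable runs."""
    env_keys = ("seed", "python", "torch")
    mismatched = [k for k in env_keys if run_a.get(k) != run_b.get(k)]
    if mismatched:
        # Different seed or versions: metric equality would prove nothing.
        raise ValueError(f"runs are not comparable, differing fields: {mismatched}")
    return {
        name: math.isclose(value, run_b["metrics"][name], rel_tol=rel_tol)
        for name, value in run_a["metrics"].items()
    }


# Made-up records for illustration only.
base = {"seed": 20260311, "python": "3.11.8", "torch": "2.2.0"}
run_1 = {**base, "metrics": {"loss": 0.5}}
run_2 = {**base, "metrics": {"loss": 0.5000000001}}
result = compare_runs(run_1, run_2)
print(result)
```

The tolerance is a judgment call: a tight value suits deterministic CPU runs, while a looser one may be needed when some kernels cannot be made deterministic.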