Computer Vision Models Fail Quietly Before Training

People blame model architecture for poor vision results, but a large share of failures starts earlier, in preprocessing inconsistencies: crop policy drift, color normalization mismatches, and label noise. These bugs rarely raise errors; they quietly degrade accuracy.

Step 1: Normalize images through one deterministic pipeline

import cv2
import numpy as np

def preprocess(img, size=(224, 224)):
    # Expects a BGR uint8 image, as returned by cv2.imread.
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; most models expect RGB
    img = cv2.resize(img, size, interpolation=cv2.INTER_AREA)  # one fixed interpolation policy
    img = img.astype(np.float32) / 255.0  # scale pixel values to [0, 1]
    return img
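A quick way to confirm the pipeline is deterministic is to hash the output of two runs on the same input. The `preprocess_stub` below is a hypothetical numpy-only stand-in for the cv2 pipeline above (crop plus scale to [0, 1]), used here only so the sketch runs without OpenCV installed; the same check applies to the real `preprocess`.

```python
import hashlib
import numpy as np

def preprocess_stub(img, size=(224, 224)):
    # Hypothetical stand-in for the cv2 pipeline in Step 1:
    # crop to size, then scale uint8 pixels to [0, 1].
    return img[: size[0], : size[1]].astype(np.float32) / 255.0

# A deterministic pipeline produces byte-identical output across runs.
img = np.random.default_rng(0).integers(0, 256, (480, 640, 3), dtype=np.uint8)
h1 = hashlib.sha1(preprocess_stub(img).tobytes()).hexdigest()
h2 = hashlib.sha1(preprocess_stub(img).tobytes()).hexdigest()
assert h1 == h2  # identical output across repeated runs
```

Hashing the raw bytes is stricter than comparing shapes or means: any drift in dtype, interpolation, or channel order changes the digest.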

Step 2: Make label validation explicit

def validate_labels(rows):
    # Fail fast: reject the batch if any row's label is outside the allowed set.
    allowed = {"cat", "dog", "bird"}
    bad = [r for r in rows if r["label"] not in allowed]
    if bad:
        raise ValueError(f"invalid labels: {len(bad)}")
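The point of explicit validation is that bad rows stop the run instead of silently training as noise. A minimal usage sketch (the function is reproduced from Step 2 so the example is self-contained; the sample rows are made up):

```python
def validate_labels(rows):
    # Reproduced from Step 2 so this sketch is self-contained.
    allowed = {"cat", "dog", "bird"}
    bad = [r for r in rows if r["label"] not in allowed]
    if bad:
        raise ValueError(f"invalid labels: {len(bad)}")

validate_labels([{"label": "cat"}, {"label": "dog"}])  # clean rows pass silently

try:
    # Comparison is case-sensitive, so a manually edited "Cat" is rejected.
    validate_labels([{"label": "Cat"}, {"label": "bird"}])
except ValueError as e:
    print(e)  # invalid labels: 1
```

Raising on the whole batch, rather than dropping bad rows, keeps label fixes visible in version control instead of hiding them in runtime filtering.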

Step 3: Track dataset fingerprints

find data/train -type f -print0 | sort -z | xargs -0 shasum > data/train.sha1
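If you want the same fingerprint from inside Python, a sketch along these lines mirrors the shell command: hash relative paths and file contents in sorted order so the digest is reproducible across machines. The function name `dataset_fingerprint` is an assumption, not part of any library.

```python
import hashlib
from pathlib import Path

def dataset_fingerprint(root):
    # Hypothetical helper: hash relative paths and contents in sorted
    # order, mirroring the find | sort | shasum command above.
    h = hashlib.sha1()
    for p in sorted(Path(root).rglob("*")):
        if p.is_file():
            h.update(str(p.relative_to(root)).encode())
            h.update(p.read_bytes())
    return h.hexdigest()
```

Log this digest with every training run; if it changes between two runs that were supposed to use the same data, the comparison is invalid.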

Pitfalls

  • Different resize policies between training and inference.
  • Manual label edits with no audit trail.
  • No dataset fingerprinting before retraining.

Verification

  • Preprocess outputs are identical across repeated runs.
  • Invalid labels fail pipeline before training starts.
  • Dataset hash changes are visible in experiment logs.
