
Batch Normalization

Feature Standardization

Input shape: (24, 64)
Features: 64
Parameters: 256 (4 × 64: γ, β, running mean, running variance)
📊 Statistics
Mean (μ): 0.00
Variance (σ²): 1.00
Scale (γ): 1.00
Shift (β): 0.00
🔄 Mode
Training: normalizes with batch statistics
Inference: normalizes with running statistics
Normalization Formula
x̂ = (x - μ) / √(σ² + ε)
y = γ × x̂ + β
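The two formula lines can be sketched directly in NumPy; the function name, the eps value, and the example batch below are illustrative, not part of the demo:

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    # Per-feature statistics over the batch axis (axis 0)
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)  # x̂ = (x - μ) / √(σ² + ε)
    return gamma * x_hat + beta            # y = γ × x̂ + β

# A (24, 64) batch like the demo's input, deliberately off-center
x = np.random.randn(24, 64) * 3.0 + 5.0
y = batchnorm_forward(x, gamma=np.ones(64), beta=np.zeros(64))
# Each feature of y now has mean ≈ 0 and variance ≈ 1
```

With γ = 1 and β = 0 (the demo's slider defaults), the output is the pure standardized x̂; changing the sliders rescales and shifts it.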

🎯 Purpose

BatchNorm stabilizes training by normalizing layer inputs. It reduces internal covariate shift and allows higher learning rates.

Effect: Notice how the "wild" orange input distribution is centered and standardized into the blue output distribution.

🎮 Controls
Scale (γ): 1.00
Shift (β): 0.00
Animation Speed: 1.0x
📐 Running Stats
Running Mean: 0.12
Running Var: 0.95

📈 During Training

During training, BatchNorm normalizes each feature with the current batch's mean and variance, and updates the running statistics with momentum so they can stand in for batch statistics at inference time.
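A minimal sketch of that momentum update, assuming an exponential moving average with momentum 0.1 (a common library default, not stated by the demo):

```python
import numpy as np

def update_running_stats(running_mean, running_var, batch_mean, batch_var,
                         momentum=0.1):
    # Exponential moving average: new = (1 - momentum) * old + momentum * batch
    running_mean = (1.0 - momentum) * running_mean + momentum * batch_mean
    running_var = (1.0 - momentum) * running_var + momentum * batch_var
    return running_mean, running_var

# From the conventional init (mean 0, variance 1), one update with a batch
# whose per-feature mean is 1.2 and variance is 0.5:
rm, rv = update_running_stats(np.zeros(64), np.ones(64),
                              batch_mean=np.full(64, 1.2),
                              batch_var=np.full(64, 0.5))
# rm = 0.12 and rv = 0.95, the values shown in the Running Stats panel
```

At inference the stored running mean and variance replace μ and σ² in the normalization formula, so the output no longer depends on the composition of the batch.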