Regularization Engine
During training, each neuron is retained with probability p and zeroed otherwise. This prevents units from co-adapting: no unit can rely on any particular other unit being present.
● Active: Signal propagates (Solid Core).
○ Dropped: Output forced to zero (Wireframe).
We scale active neurons by 1/p during training (inverted dropout), so the expected activation is unchanged and no scaling is needed at test time.
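The mechanism above can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular framework's implementation; the function name `dropout` and the keep-probability parameter `p_keep` are chosen here for clarity.

```python
import numpy as np

def dropout(x, p_keep=0.8, train=True, rng=None):
    """Inverted dropout: keep each unit with probability p_keep and
    scale survivors by 1/p_keep so E[output] == E[input]."""
    if not train:
        return x  # test time: identity, no rescaling needed
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) < p_keep  # True = active, False = dropped
    return x * mask / p_keep             # dropped units output exactly zero

# With many units, the mean activation stays close to the original
# despite ~20% of units being zeroed, because survivors are scaled up.
x = np.ones(100_000)
y = dropout(x, p_keep=0.8, rng=np.random.default_rng(0))
```

Because the scaling happens during training, the same network can be run at test time with dropout simply switched off.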