Manual hyperparameter tuning is computationally irresponsible. We employed Bayesian Optimization, an informed search strategy that converges on a near-optimal configuration in just 30 iterations instead of thousands.
"Instead of blindly searching, we teach the optimizer to think."
Grid Search tests every combination blindly: with 4 hyperparameters at roughly 10 candidate values each, that is ~10,000 evaluations. Bayesian Optimization instead uses a probabilistic model of the objective to reach a comparable optimum in about 30 informed trials.
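A quick back-of-the-envelope check of that gap. The search space below is illustrative only; the parameter names and value grids are assumptions, not the exact space we searched:

```python
import math

# Hypothetical search space: 4 hyperparameters with ~10 candidate values each
# (names and ranges are illustrative, not the article's exact space).
search_space = {
    "lstm_units": list(range(32, 352, 32)),             # 10 values
    "dropout": [i / 20 for i in range(1, 11)],          # 10 values
    "learning_rate": [10 ** -e for e in range(1, 11)],  # 10 values
    "batch_size": [2 ** e for e in range(3, 13)],       # 10 values
}

grid_evaluations = math.prod(len(v) for v in search_space.values())
print(grid_evaluations)       # 10,000 full model trainings for an exhaustive grid
print(30 / grid_evaluations)  # a 30-trial Bayesian budget is 0.3% of that
```

Every added hyperparameter multiplies the grid, so the gap only widens in higher dimensions.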
Objective: Clear the "Knowledge Fog" to find the global optimum (lowest RMSE).
Each click adds an "observation." The GP updates its belief about the function shape. Notice how the uncertainty band narrows near observations and remains wide in unexplored regions. The optimizer samples where uncertainty is high (exploration) or where the predicted value is good (exploitation).
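The surrogate-update step behind each click can be sketched in a few lines, assuming scikit-learn's `GaussianProcessRegressor`; the 1-D function `f` is a toy stand-in for the real RMSE surface:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def f(x):
    # Toy 1-D objective standing in for validation RMSE.
    return np.sin(3 * x) + 0.3 * x ** 2

observed_x = np.array([[0.2], [1.0], [2.4]])  # three "clicks" (observations)
observed_y = f(observed_x).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel(1.0) * RBF(1.0),
                              normalize_y=True)
gp.fit(observed_x, observed_y)

grid = np.linspace(0, 3, 200).reshape(-1, 1)
mean, std = gp.predict(grid, return_std=True)
# std is near zero close to observed_x and grows in unexplored regions --
# exactly the narrowing/widening uncertainty band in the demo.
```

Plotting `mean` with a `mean ± 2*std` band reproduces the visualization: tight near observations, wide in the fog.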
The EI+ (Expected Improvement Plus) acquisition function decides where to sample next by balancing two strategies:
Exploration: sample in regions of HIGH uncertainty. The algorithm ventures into unknown territory to discover potentially better configurations, which keeps it from getting stuck in local minima. Exploitation: sample where the surrogate already predicts a low RMSE, refining the most promising regions found so far.
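A minimal sketch of plain Expected Improvement, the core of EI+ (the "Plus" adds a safeguard that inflates the surrogate's uncertainty when the search over-exploits, omitted here). `mu` and `sigma` are the GP's predicted mean and standard deviation, and `best_y` is the lowest RMSE observed so far:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_y, xi=0.01):
    # EI for minimization: reward predicted improvement over the incumbent,
    # weighted by how likely the GP thinks that improvement is.
    sigma = np.maximum(sigma, 1e-9)   # avoid division by zero
    imp = best_y - mu - xi            # predicted improvement over best_y
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

# Same predicted mean, different uncertainty: the uncertain point scores
# higher, which is exactly the exploration incentive.
ei_uncertain = expected_improvement(np.array([0.5]), np.array([0.5]), best_y=0.6)
ei_confident = expected_improvement(np.array([0.5]), np.array([0.05]), best_y=0.6)
```

Both a low `mu` (exploitation) and a high `sigma` (exploration) raise EI, so maximizing it balances the two strategies automatically.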
Watch the optimizer converge toward the global minimum over 30 iterations. Each point represents a hyperparameter combination tested.
Adjust the hyperparameters yourself and see how far your choice is from the Bayesian-optimized values. Can you beat the optimizer?
After 30 Bayesian iterations (~5 hours of compute on an Intel i5 with 16 GB RAM), the algorithm converged to this configuration:
The optimizer settled on 210 BiLSTM units (near the maximum) but only 10.4% dropout (near the minimum). This suggests that the physics features provide such a clean signal that aggressive regularization is unnecessary.