Random Search for Hyper-Parameter Optimization, James Bergstra, Yoshua Bengio, 2012, Journal of Machine Learning Research, Vol. 13 (Journal of Machine Learning Research), DOI: 10.5555/2188385.2188410 - Introduces random search for hyperparameter optimization and empirically demonstrates its advantages over grid search; a foundational paper in the field.
Practical Bayesian Optimization of Machine Learning Algorithms, Jasper Snoek, Hugo Larochelle, Ryan P. Adams, 2012, Advances in Neural Information Processing Systems, Vol. 25 (Curran Associates) - A widely cited paper on the practical application of Bayesian optimization to tuning machine learning models, including neural networks.
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - A comprehensive textbook covering deep learning fundamentals, including chapters on regularization, optimization, and practical training strategies, directly relevant to hyperparameter tuning and model evaluation.