Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - Provides a foundational and comprehensive treatment of neural networks, including detailed theoretical and practical discussions on hyperparameters, optimization methods, and training strategies.
Random Search for Hyper-Parameter Optimization, James Bergstra, Yoshua Bengio, 2012, Journal of Machine Learning Research, Vol. 13. DOI: 10.5555/2188385.2188395 - Introduces Random Search and demonstrates its efficiency over Grid Search for hyperparameter optimization, particularly when not all hyperparameters are equally important.
Practical Bayesian Optimization of Machine Learning Algorithms, Jasper Snoek, Hugo Larochelle, Ryan P. Adams, 2012, Advances in Neural Information Processing Systems (NIPS) 25 (Curran Associates, Inc.). DOI: 10.5555/2999134.2999201 - A seminal work on applying Bayesian optimization to machine learning hyperparameter tuning, using the results of earlier trials to choose the next configuration rather than searching exhaustively or at random.