Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - Provides a comprehensive explanation of optimization algorithms, including how the learning rate affects gradient descent.
Machine Learning (Coursera Course), Andrew Ng, 2012 (DeepLearning.AI and Stanford Online) - An introductory course that thoroughly explains gradient descent and practical strategies for selecting the learning rate.
Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006 (Springer) - Provides a rigorous treatment of machine learning concepts, including the mathematical foundations of optimization methods like gradient descent.
Convex Optimization, Stephen Boyd, Lieven Vandenberghe, 2004 (Cambridge University Press) - A foundational textbook on convex optimization, detailing the theory of gradient methods and step size selection.