Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - This authoritative textbook offers a detailed exposition of optimization algorithms, including the mechanics of gradient descent, learning rates, and various cost functions in the context of machine learning.
Machine Learning Course Notes (CS229) - Linear Regression and Logistic Regression, Andrew Ng, 2008 (Stanford University) - These lecture notes from a renowned machine learning course offer a concise and mathematically grounded explanation of gradient descent, cost functions, and their practical application in machine learning.