Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - This foundational machine learning textbook explains how the gradient is used in optimization algorithms, particularly gradient descent, which is central to training deep learning models.
Mathematics for Machine Learning, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, 2020 (Cambridge University Press) DOI: 10.1017/9781108679904 - Bridges the gap between fundamental mathematical concepts, including calculus and optimization, and their applications in machine learning.