tf.GradientTape guide, TensorFlow Authors, 2024 (TensorFlow) - Official TensorFlow guide that details the API and usage of tf.GradientTape for automatic differentiation, including specifics like persistent tapes and higher-order gradients.
Automatic Differentiation in Machine Learning: A Survey, Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, Jeffrey Mark Siskind, David Eugene Stewart, Zachary DeVito, James Bradbury, 2018Journal of Machine Learning Research, Vol. 18 (Microtome Publishing)DOI: 10.5555/3342930.3342931 - A comprehensive academic survey providing a detailed account of automatic differentiation methods, their history, and their widespread application in machine learning, with a focus on reverse-mode AD.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A foundational textbook that thoroughly explains backpropagation, which is a specific instance of reverse-mode automatic differentiation applied to neural networks, and various gradient-based optimization algorithms. Chapters 6 and 8 are particularly relevant.