Adam: A Method for Stochastic Optimization, Diederik P. Kingma and Jimmy Ba, 2014, 3rd International Conference for Learning Representations, DOI: 10.48550/arXiv.1412.6980 - The original research paper introducing the Adam optimizer, detailing its algorithmic mechanics, bias correction, and empirical performance.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A widely recognized textbook that provides a comprehensive overview of deep learning concepts, including a detailed explanation of the Adam optimization algorithm.
Optimizers in Deep Learning, Andrej Karpathy, Justin Johnson, Serena Yeung, et al., 2023, Stanford University CS231n: Convolutional Neural Networks for Visual Recognition, Lecture Notes (Stanford University) - Educational notes from a popular Stanford course, offering a clear and accessible explanation of Adam and other optimization algorithms for deep learning.