SGD, PyTorch Developers, 2024 (PyTorch Foundation) - Official documentation for PyTorch's `torch.optim.SGD` optimizer, which implements standard SGD, Momentum, and Nesterov Accelerated Gradient (a brief usage sketch follows these references).
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A foundational textbook offering a thorough academic treatment of deep learning, with sections dedicated to the theory and mechanics of Stochastic Gradient Descent, Momentum, and Nesterov Accelerated Gradient.
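For orientation only, here is a minimal sketch (not taken from the cited documentation) of how the three variants covered by these references map onto the `torch.optim.SGD` constructor arguments; the model, learning rate, and momentum values are arbitrary placeholders:

```python
import torch

# Placeholder model purely for illustration.
model = torch.nn.Linear(10, 1)

# Plain stochastic gradient descent.
plain_sgd = torch.optim.SGD(model.parameters(), lr=0.01)

# SGD with classical (heavy-ball) momentum.
momentum_sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Nesterov Accelerated Gradient; PyTorch requires momentum > 0 (and zero dampening) when nesterov=True.
nesterov_sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True)
```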