Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - Provides a comprehensive theoretical foundation and detailed explanation of optimization algorithms, including SGD with momentum, within the context of deep learning.
SGD - torch.optim, PyTorch Developers, 2024 (PyTorch Documentation) - Official documentation for the Stochastic Gradient Descent optimizer in PyTorch, detailing its parameters, including the momentum coefficient, and practical usage.