Learning representations by back-propagating errors, David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams, 1986, Nature, Vol. 323 (Springer Nature), DOI: 10.1038/323533a0 - Presents the backpropagation algorithm as an efficient method for computing gradients in multi-layer neural networks, significantly advancing their training.
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - A comprehensive textbook where Chapter 6 covers deep feedforward networks, including a thorough mathematical derivation and explanation of backpropagation.
Neural Networks and Deep Learning, Michael A. Nielsen, 2015 (Determination Press) - Provides a clear, step-by-step explanation of how backpropagation works, making the underlying calculus intuitive and accessible.