Long Short-Term Memory, Sepp Hochreiter, Jürgen Schmidhuber, 1997, Neural Computation, Vol. 9 (MIT Press), DOI: 10.1162/neco.1997.9.8.1735 - The original paper introducing the Long Short-Term Memory (LSTM) network and its gating mechanisms to address gradient issues in RNNs.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - An authoritative textbook that provides a detailed explanation of recurrent neural networks, vanishing/exploding gradients, and the architecture and function of LSTM networks.
Recurrent Neural Networks (RNNs) and LSTMs, Chris Manning, 2023 (Stanford University) - Lecture notes from a leading university course, offering clear explanations of RNN limitations, the concept of gating, and the structure of LSTMs.