Long Short-Term Memory, Sepp Hochreiter, Jürgen Schmidhuber, 1997, Neural Computation, Vol. 9 (MIT Press). DOI: 10.1162/neco.1997.9.8.1735 - The foundational paper introducing the Long Short-Term Memory (LSTM) architecture, detailing its cell state update mechanism and gates.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive textbook covering the theoretical underpinnings of deep learning, including a detailed section on recurrent neural networks and LSTM architectures.
Recurrent Neural Networks (RNNs) and LSTMs (Lecture Slides), Tatsunori Hashimoto, Christopher Manning, 2023 (Stanford University) - Lecture slides from the CS224N course giving a clear, in-depth explanation of LSTMs, including the cell state update and its advantages for gradient flow.