Long Short-Term Memory, Sepp Hochreiter, Jürgen Schmidhuber, 1997, Neural Computation, Vol. 9 (MIT Press), DOI: 10.1162/neco.1997.9.8.1735 - Introduces the Long Short-Term Memory (LSTM) network architecture, essential for understanding gate mechanisms and handling long-range dependencies in recurrent neural networks.
Learning long-term dependencies with gradient descent is difficult, Yoshua Bengio, Patrice Simard, and Paolo Frasconi, 1994, IEEE Transactions on Neural Networks, Vol. 5 (IEEE), DOI: 10.1109/72.279181 - Explores the inherent difficulties of training recurrent neural networks, particularly the vanishing gradient problem, which this section addresses with visualization techniques.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive textbook covering the theoretical and practical aspects of deep learning, including recurrent neural networks, their challenges, and general principles of model analysis.
Getting Started with TensorBoard, TensorFlow Authors, 2024 (TensorFlow) - Official guide for TensorBoard, a visualization tool for deep learning model training, metrics, graphs, and performance, directly relevant to the 'Tools for Visualization' section.