Many important tasks in natural language processing and other domains involve transforming one sequence into another. We call these sequence-to-sequence (Seq2Seq) tasks. Think of machine translation: converting a sequence of words in one language (e.g., English) into a sequence of words in another (e.g., French). Other examples include text summarization (a long document in, a short summary out), question answering, speech recognition (audio in, a transcript out), and dialogue generation.
Here's a conceptual view of a generic Seq2Seq task like translation:
A simple illustration of a sequence-to-sequence task, mapping an input sequence ("The", "cat", "sat") to an output sequence ("Le", "chat", "assis") via a model.
While the concept seems straightforward, effectively modeling these transformations presents several significant hurdles.
The meaning of a sequence often depends critically on the order of its elements. "The cat chased the dog" means something entirely different from "The dog chased the cat". A successful model must not only understand the individual elements (words, in this case) but also how their position and the surrounding elements influence the overall meaning. It needs to capture the contextual relationships within the sequence.
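To make this concrete, here is a small Python sketch (not tied to any particular model) showing why an order-insensitive representation is not enough: a simple bag-of-words view treats the two sentences above as identical, even though their meanings differ.

```python
from collections import Counter

s1 = "the cat chased the dog".split()
s2 = "the dog chased the cat".split()

# An order-insensitive (bag-of-words) view cannot tell the sentences apart...
print(Counter(s1) == Counter(s2))  # True: same words, same counts
# ...even though the sequences, and their meanings, clearly differ.
print(s1 == s2)                    # False: the order is different
```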
One of the most persistent difficulties in sequence modeling is capturing long-range dependencies. This refers to situations where understanding or predicting an element in the sequence requires information from elements that appeared much earlier.
Consider this example:
"I grew up in a small village in the south of France, near the Pyrenees. Although I moved away many years ago, I still visit often. As a result, I speak fluent French."
To correctly predict "French" at the end, the model needs to connect it back to "France" mentioned several sentences earlier. If the intermediate text were much longer, this connection becomes even harder to maintain. Models need mechanisms to "remember" or access relevant information across potentially vast distances within the sequence, avoiding the dilution or loss of this information over time or position. Traditional approaches often struggle here, as their "memory" can be limited.
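As a rough illustration (a toy calculation, not part of the original example), we can count how many tokens separate the clue from the word it determines; a model's memory must bridge at least this span, and real documents can stretch it much further.

```python
text = ("I grew up in a small village in the south of France, near the Pyrenees. "
        "Although I moved away many years ago, I still visit often. "
        "As a result, I speak fluent French.")

# Crude whitespace tokenization, just to measure the span between the clue
# ("France") and the word it helps predict ("French").
tokens = text.replace(",", "").replace(".", "").split()
gap = tokens.index("French") - tokens.index("France")
print(f"The model must carry 'France' across {gap} tokens to predict 'French'.")
```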
Early approaches to Seq2Seq tasks often involved summarizing the entire input sequence into a single, fixed-size vector representation (often called a "context vector" or "thought vector"). This vector was then expected to contain all the necessary information from the input sequence for the model to start generating the output sequence.
Imagine trying to summarize this entire chapter into a single, short sentence. You'd inevitably lose a lot of detail and nuance. Similarly, forcing a complex input sequence, especially a long one, into one fixed-size vector creates an information bottleneck. It's difficult for the model to encode every important detail, leading to degraded performance, particularly on longer or more complex sequences. The model might forget earlier parts of the input or fail to capture subtle relationships.
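The sketch below (a hypothetical PyTorch encoder, not an implementation from this chapter) makes the bottleneck visible: whether the input has 5 tokens or 500, the encoder's output is the same fixed-size vector.

```python
import torch
import torch.nn as nn

class FixedVectorEncoder(nn.Module):
    """Toy encoder that compresses any input sequence into one fixed-size vector."""
    def __init__(self, vocab_size=1000, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)       # (batch, seq_len, embed_dim)
        _, final_hidden = self.rnn(embedded)   # (1, batch, hidden_dim)
        return final_hidden.squeeze(0)         # one fixed-size "context vector"

encoder = FixedVectorEncoder()
short_input = torch.randint(0, 1000, (1, 5))    # 5-token sequence
long_input = torch.randint(0, 1000, (1, 500))   # 500-token sequence
print(encoder(short_input).shape)  # torch.Size([1, 64])
print(encoder(long_input).shape)   # torch.Size([1, 64]): same size, hence the bottleneck
```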
Seq2Seq tasks rarely involve inputs and outputs of the same length. A short phrase in one language might translate to a longer sentence in another. A lengthy article might be summarized into just a few sentences. The model architecture must be flexible enough to handle these variations, consuming an input of arbitrary length N and generating an output of arbitrary length M, where N and M can be different for each example.
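One common way architectures handle this, shown in the toy sketch below (the encode and decode functions are stand-ins, not a real model), is to let the decoder emit one token at a time until it produces a special end-of-sequence marker, so the output length M is decided during generation rather than fixed by the input length N.

```python
import random

VOCAB = ["le", "chat", "assis", "<eos>"]

def toy_encode(input_tokens):
    # Stand-in for a real encoder: any fixed summary of the input would do here.
    return len(input_tokens)

def toy_decode_step(context, generated_so_far):
    # Stand-in for a real decoder step: picks the next token at random.
    return random.choice(VOCAB)

def generate(input_tokens, max_len=20):
    context = toy_encode(input_tokens)
    output = []
    while len(output) < max_len:
        token = toy_decode_step(context, output)
        if token == "<eos>":   # output length M is decided here,
            break              # not by the input length N
        output.append(token)
    return output

print(generate(["The", "cat", "sat"]))  # output length is independent of the 3 input tokens
```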
These challenges highlight the need for architectures that can effectively capture sequential dependencies, handle long-range context without information loss, and manage variable sequence lengths. Understanding these difficulties motivates the development of mechanisms like attention, which directly address the bottleneck problem and improve the handling of dependencies, paving the way for models like the Transformer. In the next section, we'll briefly review Recurrent Neural Networks (RNNs), an earlier approach designed to handle sequential data, before examining their specific limitations.