Sequence to Sequence Learning with Neural Networks, Ilya Sutskever, Oriol Vinyals, Quoc V. Le, 2014, Advances in Neural Information Processing Systems (NIPS) 27 - Proposed a general end-to-end sequence learning method, demonstrating the encoder-decoder architecture with a fixed-size context vector.
Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, 2017, Advances in Neural Information Processing Systems (NIPS) 30 (Curran Associates, Inc.), DOI: 10.48550/arXiv.1706.03762 - Introduced the Transformer model, designed to overcome the limitations of recurrent models in sequential computation and in capturing long-range dependencies.
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - Provides the foundations of recurrent neural networks, backpropagation through time, and the associated training difficulties.