Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems, Vol. 30. DOI: 10.48550/arXiv.1706.03762 - This seminal paper introduces the Transformer, an architecture based entirely on attention mechanisms. It has become the foundation for sequence-to-sequence tasks and is the cornerstone of Transformer-based autoencoders.