Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems 30, Vol. 30 (Curran Associates, Inc.) - Introduces the Transformer architecture and its original absolute sinusoidal positional encoding, whose limitations this section discusses.
Self-Attention with Relative Position Representations, Peter Shaw, Jakob Uszkoreit, Ashish Vaswani, 2018. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), Vol. 2 (Association for Computational Linguistics). DOI: 10.18653/v1/N18-2074 - Proposes a method for encoding relative position information explicitly in the self-attention mechanism, addressing a key limitation of absolute positional encodings.