Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems 30 (NIPS 2017). DOI: 10.48550/arXiv.1706.03762 - This foundational paper introduced the Transformer architecture and provides the necessary background for understanding positional encoding mechanisms such as RoPE.
Self-Attention with Relative Position Representations, Peter Shaw, Jakob Uszkoreit, Ashish Vaswani, 2018. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). DOI: 10.48550/arXiv.1803.02155 - This paper proposed an early method for incorporating relative position representations into the self-attention mechanism, serving as a point of comparison with RoPE's approach.