Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. arXiv preprint arXiv:1706.03762. DOI: 10.48550/arXiv.1706.03762 - The foundational paper introducing the Transformer architecture, which underlies current large language models and their context window mechanisms.