Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems 30 (NIPS 2017). DOI: https://doi.org/10.48550/arXiv.1706.03762 - The foundational paper introducing the Transformer architecture, which underlies modern large language models and enables them to process and generate text.