Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, 2017Advances in Neural Information Processing Systems, Vol. 30 (Curran Associates, Inc.)DOI: 10.48550/arXiv.1706.03762 - 介绍了 Transformer 架构,该架构是许多现代计算密集型嵌入模型的基础,在大型 RAG 系统中发挥着重要作用。