Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin, 2017. Advances in Neural Information Processing Systems, Vol. 30 (Curran Associates, Inc.) - Introduces the Transformer architecture, which relies entirely on self-attention, and formalizes how queries, keys, and values interact in the attention computation.