Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems (NeurIPS) 30. DOI: 10.48550/arXiv.1706.03762 - Introduces the Transformer architecture and its self-attention mechanism, which offers a useful analogy for the weighted-sum computation in soft routing.