Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems (NeurIPS) 30. DOI: 10.48550/arXiv.1706.03762 - Introduced the Transformer architecture, which relies entirely on attention mechanisms instead of recurrence or convolution; its encoder is a common component of modern ASR models.