Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems 30 (NIPS 2017), Vol. 30. DOI: 10.48550/arXiv.1706.03762 - Introduces the Transformer architecture, which relies heavily on self-attention and has become the standard choice for the encoder and decoder components of modern NLP VAEs.