Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems 30 (NIPS 2017). DOI: 10.48550/arXiv.1706.03762 - The foundational paper that introduced the Transformer architecture, its encoder-decoder design, the self-attention mechanism, and positional encoding.
Transformer API Reference, PyTorch Documentation, 2024 (PyTorch Foundation) - Official documentation for PyTorch's built-in Transformer modules, covering implementation details and API usage.