Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems 30 (Curran Associates, Inc.). DOI: 10.48550/arXiv.1706.03762 - Introduces the Transformer architecture, including multi-head self-attention and cross-attention mechanisms.
Denoising Diffusion Probabilistic Models, Jonathan Ho, Ajay N. Jain, Pieter Abbeel, 2020. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Vol. 33 (Neural Information Processing Systems Foundation, Inc.). DOI: 10.48550/arXiv.2006.11239 - The seminal work on denoising diffusion probabilistic models, which laid the foundation for many modern diffusion-based generative models.