Attention Is All You Need. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin. 2017. Advances in Neural Information Processing Systems 30, Curran Associates, Inc. DOI: 10.48550/arXiv.1706.03762 - Introduces the Transformer architecture, including the foundational multi-head self-attention and cross-attention mechanisms.
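As a quick illustration of the scaled dot-product attention at the core of this paper, here is a minimal NumPy sketch; the single-head simplification and the array shapes are my assumptions, not the paper's full multi-head formulation:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                   # similarity of each query to each key
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
        return weights @ V                                # weighted sum of value vectors

    # Self-attention: queries, keys, and values all come from the same sequence.
    x = np.random.randn(5, 64)                            # 5 toy tokens, 64-dim embeddings
    out = scaled_dot_product_attention(x, x, x)           # shape (5, 64)

In the paper this operation is run in parallel over several learned projections ("heads") whose outputs are concatenated; the sketch keeps only the core computation.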
Denoising Diffusion Probabilistic Models. Jonathan Ho, Ajay Jain, Pieter Abbeel. 2020. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Neural Information Processing Systems Foundation, Inc. DOI: 10.48550/arXiv.2006.11239 - Foundational work on denoising diffusion probabilistic models, providing the basis for many modern diffusion-based generative models.
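A hedged sketch of the closed-form forward noising step that DDPMs use to build training pairs; the linear beta schedule, step count, and tensor shapes below are illustrative assumptions:

    import numpy as np

    T = 1000
    betas = np.linspace(1e-4, 0.02, T)            # assumed linear noise schedule
    alphas_bar = np.cumprod(1.0 - betas)          # cumulative product, "alpha-bar_t"

    def q_sample(x0, t, rng=np.random.default_rng()):
        # Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)
        eps = rng.standard_normal(x0.shape)
        xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
        return xt, eps                            # the network is trained to predict eps from (xt, t)

    x0 = np.random.randn(3, 32, 32)               # toy "image" batch
    xt, eps = q_sample(x0, t=500)

Training then amounts to regressing the added noise eps from the noised sample, and generation runs the learned denoiser in reverse from pure noise.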
High-Resolution Image Synthesis with Latent Diffusion Models. Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer. 2022. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). DOI: 10.48550/arXiv.2112.10752 - Details the architecture of Latent Diffusion Models, which use cross-attention to integrate text conditioning for image generation.
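To make the cross-attention conditioning concrete, a minimal NumPy sketch in the same style as above; the projection sizes and the random stand-ins for latents and text embeddings are assumptions, since in the actual model the conditioning comes from a pretrained text encoder and the attention layers sit inside the denoising U-Net:

    import numpy as np

    def cross_attention(latent_tokens, text_tokens, d_k=64, rng=np.random.default_rng(0)):
        # Queries come from the image latents; keys and values come from the text embeddings.
        d_lat, d_txt = latent_tokens.shape[-1], text_tokens.shape[-1]
        W_q = rng.standard_normal((d_lat, d_k))
        W_k = rng.standard_normal((d_txt, d_k))
        W_v = rng.standard_normal((d_txt, d_k))
        Q, K, V = latent_tokens @ W_q, text_tokens @ W_k, text_tokens @ W_v
        scores = Q @ K.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # softmax over text tokens
        return weights @ V                                # each latent token attends to the text

    latents = np.random.randn(16, 320)            # toy flattened spatial latent tokens
    text = np.random.randn(7, 768)                # toy text-encoder output (7 tokens)
    out = cross_attention(latents, text)          # shape (16, 64)

The design point this illustrates is that the text prompt enters only through the keys and values, so the spatial latents decide, per position, which parts of the prompt to attend to.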