Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin, 2017. Advances in Neural Information Processing Systems (NeurIPS), Vol. 30 (Curran Associates, Inc.). DOI: 10.48550/arXiv.1706.03762 - The foundational paper that introduced the Transformer architecture and popularized the self-attention mechanism, explaining its core computational components, which have since been applied across a wide range of deep learning tasks.
Generative Deep Learning: Teaching Machines to Paint, Write, Compose, and Play, David Foster, 2023 (O'Reilly Media) - A comprehensive book offering practical guidance and theoretical foundations for generative models, including detailed discussions of GANs and their architectural advances, such as the role of attention mechanisms.