Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017 (arXiv, DOI: 10.48550/arXiv.1706.03762) - The original paper introducing the Transformer architecture, self-attention, and the necessity and mechanism of positional encoding.
Speech and Language Processing (3rd ed. draft), Daniel Jurafsky and James H. Martin, 2025 - A comprehensive textbook covering sequence models like RNNs and the Transformer, providing context for why positional information is essential in NLP.
Transformers and Self-Attention (CS224N Winter 2023 Lecture Notes), Tatsunori Hashimoto, 2023 (Stanford University) - Course notes providing clear explanations of the Transformer architecture and the reasoning behind positional encoding.