Improving Language Understanding by Generative Pre-Training, Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever, 2018 (OpenAI) - Presents the original GPT model, which implements learned positional embeddings as part of its architecture.
Speech and Language Processing (3rd Edition), Daniel Jurafsky and James H. Martin, 2025 (Prentice Hall) - A comprehensive textbook with detailed explanations of Transformer models, including various approaches to positional encoding.