The self-attention mechanism, as we've seen, processes all input tokens simultaneously. While powerful for capturing dependencies regardless of distance, this parallel processing comes at a cost: the standard self-attention operation is permutation-invariant. If you shuffle the input tokens, the attention outputs (before adding positional information) would simply be a shuffled version of the original outputs. It has no inherent knowledge of the sequence order. "The cat sat on the mat" and "mat the on sat cat The" would look the same to the self-attention layer itself. Clearly, for language modeling and most sequence tasks, order is fundamental. We need a way to inject information about the position of each token into the model. This is achieved through positional encoding.
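To make this concrete, here is a quick check, a minimal sketch using PyTorch's nn.MultiheadAttention (with its default zero dropout; the shapes and seed are arbitrary), showing that shuffling the input merely shuffles the output:

import torch
import torch.nn as nn

torch.manual_seed(0)

d_model, num_heads, seq_len = 64, 4, 6
attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
attn.eval()  # ensure no dropout is applied

x = torch.randn(1, seq_len, d_model)   # (batch, seq_len, d_model)
perm = torch.randperm(seq_len)         # a random reordering of the positions

with torch.no_grad():
    out, _ = attn(x, x, x)                                      # original order
    out_shuffled, _ = attn(x[:, perm], x[:, perm], x[:, perm])  # shuffled order

# The output for the shuffled input is just the original output, shuffled.
print(torch.allclose(out[:, perm], out_shuffled, atol=1e-6))    # True

If the check prints True (up to floating-point noise), the attention layer really does treat its input as an unordered set of tokens.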
The core idea is to create a vector, the positional encoding, that represents the position of a token in the sequence. This vector is then added to the corresponding token's input embedding. This combined embedding, now containing both semantic information (from the token embedding) and positional information, is fed into the Transformer stack.
$$\text{InputEmbedding}_{\text{final}} = \text{TokenEmbedding} + \text{PositionalEncoding}$$
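In code, combining the two is a plain element-wise addition. The sketch below uses random tensors as stand-ins for real token embeddings and positional encodings; the shapes are only illustrative:

import torch

batch_size, seq_len, d_model = 2, 10, 768

# Stand-ins: in a real model these come from the token embedding layer
# and from one of the positional encoding schemes described below.
token_embeddings = torch.randn(batch_size, seq_len, d_model)
positional_encodings = torch.randn(seq_len, d_model)

# Broadcasting adds the same positional vector at each position
# for every sequence in the batch.
final_input = token_embeddings + positional_encodings
print(final_input.shape)  # torch.Size([2, 10, 768])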
There are several ways to generate these positional encoding vectors.
Perhaps the most straightforward approach is to learn the positional encodings just like we learn token embeddings. We can define a maximum sequence length, say $L_{max}$, and create an embedding matrix of size $(L_{max}, d_{model})$, where $d_{model}$ is the dimension of the model's embeddings. For a token at position $pos$ (where $0 \le pos < L_{max}$), we simply look up the $pos$-th vector in this embedding matrix and add it to the token's embedding.
In PyTorch, this can be implemented using nn.Embedding:
import torch
import torch.nn as nn
# Example parameters
max_seq_len = 512
d_model = 768
# Learned Positional Embedding Layer
positional_embedding_table = nn.Embedding(max_seq_len, d_model)
# Example usage: Get embeddings for positions 0, 1, 2, ..., seq_len-1
seq_len = 100
positions = torch.arange(
    0, seq_len, dtype=torch.long
).unsqueeze(0)  # Shape: (1, seq_len)
learned_pe = positional_embedding_table(positions)
# Shape: (1, seq_len, d_model)
print(f"Shape of learned positional embeddings: {learned_pe.shape}")
# Output: Shape of learned positional embeddings:
# torch.Size([1, 100, 768])
This method is simple and allows the model to learn the optimal way to represent positions for the specific task and data. However, it has drawbacks:
- It adds $L_{max} \times d_{model}$ trainable parameters to the model.
- It cannot represent positions beyond $L_{max}$, so the model does not generalize to sequences longer than those planned for at training time; the short check below makes this concrete.
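The second drawback is easy to demonstrate by continuing the snippet above: asking the learned table for a position at or beyond max_seq_len is an out-of-range lookup and should fail with an indexing error.

# Continuing the example above: positions >= max_seq_len cannot be encoded.
too_long = torch.arange(0, max_seq_len + 1, dtype=torch.long).unsqueeze(0)
try:
    positional_embedding_table(too_long)
except IndexError as err:
    print(f"Position {max_seq_len} is out of range: {err}")

The sinusoidal encodings described next avoid this hard limit, because they are computed from a formula rather than looked up in a table of fixed size.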
The original Transformer paper (Vaswani et al., 2017) proposed a fixed, non-learned positional encoding method using sine and cosine functions of varying frequencies. The motivation was to use a deterministic function that could allow the model to attend to relative positions easily, since for any fixed offset $k$, $PE_{pos+k}$ can be represented as a linear function of $PE_{pos}$. It also avoids the extra parameters of learned embeddings and might generalize better to unseen sequence lengths.
The formula for the positional encoding PE for a token at position pos and dimension index i is defined as:
$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right) \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$
Here:
- $pos$ is the position of the token in the sequence (starting from 0),
- $i$ indexes the dimension pairs, so the even dimension $2i$ receives the sine and the odd dimension $2i+1$ receives the cosine,
- $d_{model}$ is the embedding dimension of the model.
Each dimension of the positional encoding corresponds to a sinusoid. The wavelengths form a geometric progression from 2π to 10000⋅2π. This choice allows the model to potentially learn to attend to relative positions, as the relative position information is encoded in the phase differences.
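As a quick sanity check of the formula (a hand computation, not tied to any particular library), consider position pos = 1 with d_model = 128:

import math

d_model = 128
pos = 1

# Dimension pair (0, 1): i = 0, so the angle is pos / 10000**(0 / d_model) = 1.0
print(math.sin(1.0))  # PE(1, 0) ≈ 0.8415
print(math.cos(1.0))  # PE(1, 1) ≈ 0.5403

# Dimension pair (2, 3): i = 1, so the angle is pos / 10000**(2 / d_model),
# a slightly smaller angle, i.e. a lower-frequency sinusoid.
angle = pos / 10000 ** (2 / d_model)
print(math.sin(angle), math.cos(angle))

These values should match row pos = 1 of the tensor produced by the implementation below.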
Let's implement this in PyTorch:
import torch
import math
import matplotlib.pyplot as plt
def get_sinusoidal_positional_encoding(seq_len, d_model):
    """Calculates the sinusoidal positional encoding."""
    pe = torch.zeros(seq_len, d_model)
    position = torch.arange(
        0, seq_len, dtype=torch.float
    ).unsqueeze(1)  # Shape: (seq_len, 1)
    # Term for calculating the frequencies
    div_term = torch.exp(
        torch.arange(0, d_model, 2).float()
        * (-math.log(10000.0) / d_model)
    )  # Shape: (d_model/2)
    # Calculate sine for even indices, cosine for odd indices
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    # Add batch dimension (optional, often done later)
    # pe = pe.unsqueeze(0)  # Shape: (1, seq_len, d_model)
    return pe
# Example: Generate encoding for sequence length 100,
# model dimension 128
seq_len = 100
d_model = 128
fixed_pe = get_sinusoidal_positional_encoding(seq_len, d_model)
# Shape: (100, 128)
print(f"Shape of fixed positional embeddings: {fixed_pe.shape}")
# Output: Shape of fixed positional embeddings:
# torch.Size([100, 128])
# Visualize the first few dimensions
plt.figure(figsize=(10, 5))
# Plot dimensions 0, 2, 4, 6
for i in range(0, 8, 2):
    plt.plot(fixed_pe[:, i].numpy(), label=f'Dim {i} (sin)')
# Plot dimensions 1, 3, 5, 7
for i in range(1, 9, 2):
    plt.plot(
        fixed_pe[:, i].numpy(),
        label=f'Dim {i} (cos)',
        linestyle='--'
    )
plt.ylabel("Value")
plt.xlabel("Position")
plt.title("Sinusoidal Positional Encoding (First 8 Dimensions)")
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.show()
First 8 dimensions of the sinusoidal positional encoding plotted against position. Note how lower dimensions (smaller $i$) vary faster (higher frequency) than higher dimensions.
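The relative-position property mentioned earlier can also be checked numerically: for each sine/cosine pair, moving from position pos to pos + k is a fixed 2×2 rotation whose angle depends only on the offset k and the pair's frequency, not on pos. The snippet below is a small illustration reusing fixed_pe and d_model from the code above; the offset k = 5 is arbitrary.

k = 5  # arbitrary offset
freqs = torch.exp(
    torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
)  # the same frequencies as div_term inside the function

# Rotate PE(pos) by the offset-dependent angle to predict PE(pos + k)
cos_k, sin_k = torch.cos(k * freqs), torch.sin(k * freqs)
pred_sin = fixed_pe[:-k, 0::2] * cos_k + fixed_pe[:-k, 1::2] * sin_k
pred_cos = fixed_pe[:-k, 1::2] * cos_k - fixed_pe[:-k, 0::2] * sin_k

print(torch.allclose(pred_sin, fixed_pe[k:, 0::2], atol=1e-5))  # True
print(torch.allclose(pred_cos, fixed_pe[k:, 1::2], atol=1e-5))  # True

This is the sense in which $PE_{pos+k}$ is a linear function of $PE_{pos}$, which is what makes it plausible for attention to recover relative offsets from absolute sinusoidal encodings.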
While effective and widely used, especially in the original Transformer and models like BERT, sinusoidal encodings are fixed. They might not be the optimal representation for all types of sequential patterns.
Both learned and fixed sinusoidal positional encodings are common starting points.
In practice, the choice might depend on the specific application, model size, and sequence length requirements. It's also worth noting that the field has developed more advanced techniques. Absolute positional encodings, whether learned or fixed, primarily encode the position of a token relative to the start of the sequence. However, it's often the relative position between tokens that matters most for attention. Techniques like Relative Positional Encoding and Rotary Position Embedding (RoPE) directly incorporate relative distance information into the attention mechanism itself. These more advanced methods are explored in Chapter 13.
For now, understanding learned and sinusoidal encodings provides the necessary foundation for how Transformers incorporate sequence order information, overcoming the permutation invariance of the core self-attention mechanism. This injection of positional data is a simple yet essential element enabling the Transformer's success on sequential data.