Transformer, PyTorch Core Team, 2024 - Official documentation for the torch.nn.Transformer module, useful for implementation details.
Speech and Language Processing (3rd Edition Draft), Daniel Jurafsky and James H. Martin, 2025 (Stanford University) - A comprehensive textbook covering sequence-to-sequence models and the Transformer architecture for natural language and speech processing.