Speech and Language Processing, Daniel Jurafsky and James H. Martin, 2025 - A comprehensive textbook providing detailed explanations of speech recognition, including decoding algorithms like greedy and beam search, and the role of language models in ASR. Chapter 10 "Speech Recognition" and Chapter 12 "Speech Synthesis, Statistical Models, and Deep Learning for ASR" are particularly relevant.
Practical End-to-End Speech Recognition with Beam Search, Veronica Tozzo, Federico Tomasi, Margherita Squillario, Annalisa Barla, 2018, arXiv preprint arXiv:1811.09673, DOI: 10.48550/arXiv.1811.09673 - This paper provides a practical overview of end-to-end speech recognition, specifically focusing on beam search decoding with CTC and the integration of language models, which directly addresses the core topics of the section.