Practice: Evaluating and Building a Demo Application
Speech and Language Processing (3rd ed. draft), Daniel Jurafsky, James H. Martin, 2025 - Provides detailed explanations of Automatic Speech Recognition (ASR) evaluation metrics, including Word Error Rate (WER), in a widely recognized textbook.
Transformers Library - Documentation, Hugging Face team, 2024 - The official documentation for the Hugging Face Transformers library, providing detailed guides on using pre-trained models and the pipeline API for tasks such as ASR.
Robust Speech Recognition via Large-Scale Weak Supervision, Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever, 2022, arXiv preprint arXiv:2212.04356, DOI: 10.48550/arXiv.2212.04356 - Introduces the Whisper model, a multi-task speech recognition model trained on a large and diverse dataset, which is utilized in the demo application for its high performance.