Chapter 3: Acoustic Modeling

After converting raw audio into a sequence of feature vectors, the next step is to map those features to the basic sounds of a language. This is the primary function of the acoustic model. It addresses the question: given a small segment of audio, what is the probability that a specific phoneme, like /k/, /æ/, or /t/, was spoken?

The acoustic model provides the statistical relationship between the audio signal and its corresponding phonetic units. It computes the likelihood $P(\text{audio\_features} \mid \text{phoneme})$, a probability that serves as a key input for the final transcription process.
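
To make the likelihood concrete, here is a minimal sketch of scoring one frame of features against a few phoneme models. It assumes each phoneme is described by a single diagonal-covariance Gaussian over a toy 2-dimensional feature vector. The names `PHONEME_MODELS`, `log_likelihood`, and `score_frame`, and all of the numbers, are made up for illustration. Real acoustic models use mixtures of Gaussians or neural networks over much larger feature vectors, as later sections describe.

```python
import math

# Hypothetical per-phoneme models (illustrative numbers only): one
# diagonal-covariance Gaussian per phoneme over a 2-dimensional feature vector.
# "ae" stands in for the phoneme /æ/.
PHONEME_MODELS = {
    "k":  {"mean": [1.0, -0.5], "var": [0.5, 0.4]},
    "ae": {"mean": [-0.8, 1.2], "var": [0.6, 0.3]},
    "t":  {"mean": [0.9, 0.7],  "var": [0.4, 0.5]},
}


def log_likelihood(features, mean, var):
    """Log density of a diagonal Gaussian evaluated at `features`."""
    total = 0.0
    for x, m, v in zip(features, mean, var):
        total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def score_frame(features):
    """Return log P(features | phoneme) for every phoneme model."""
    return {
        phoneme: log_likelihood(features, model["mean"], model["var"])
        for phoneme, model in PHONEME_MODELS.items()
    }


# One made-up frame of audio features.
frame = [0.9, -0.4]
scores = score_frame(frame)
best = max(scores, key=scores.get)

for phoneme, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"log P(features | /{phoneme}/) = {score:.3f}")
print("highest-scoring phoneme:", best)
```

The sketch works in log space because multiplying many small likelihoods across frames quickly underflows, while adding log-likelihoods stays numerically stable. For this frame the /k/ model scores highest, since the frame lies closest to that model's mean.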

This chapter covers:

The function of an acoustic model within the ASR pipeline.
The traditional approach of using Gaussian Mixture Models (GMMs) to represent audio features for each phoneme.
How Hidden Markov Models (HMMs) are used to handle the sequential patterns in speech.
An introduction to how neural networks are used for modern acoustic modeling.

By the end, you will have a clear picture of how this component connects processed sound to the building blocks of speech.

Sections

3.1 What is an Acoustic Model?
3.2 Mapping Sounds to Phonemes
3.3 Early Approaches: Gaussian Mixture Models (GMMs)
3.4 Hidden Markov Models (HMMs) for Sequential Data
3.5 Combining GMMs and HMMs
3.6 Introduction to Neural Network-based Acoustic Models
3.7 The Role of an Acoustic Model in an ASR System
