When you train a neural network, the ultimate goal isn't just to achieve high accuracy on the data it was trained on. The true test of a model is its ability to generalize, meaning how well it performs on new, unseen data. Two common challenges that hinder generalization are underfitting and overfitting. Finding the right balance between these two is fundamental to building effective deep learning models.
Imagine you're trying to fit a curve to a set of data points. A curve that is too simple, such as a straight line through clearly curved data, misses the underlying pattern, while an overly flexible curve bends to pass through every single point, noise included. Neural networks face the same dilemma.
An underfit model typically performs poorly on both the training data and the validation (or test) data. This usually means the model is too simple to learn the meaningful relationships within the data, either because it lacks capacity (e.g., not enough layers or neurons) or because it was trained for too few epochs.
Signs of underfitting include high error on the training set, similarly high error on the validation set, and metrics that plateau well below what the task should allow. Visually, the training and validation loss curves typically both remain high and show little improvement over epochs.
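To make the idea of capacity concrete, here is a minimal sketch contrasting a model that is likely to underfit with a higher-capacity alternative. The layer sizes and the 20-feature input shape are illustrative assumptions, not tied to any particular dataset:

from tensorflow import keras
from tensorflow.keras import layers

# Likely to underfit: a single tiny hidden layer has very little capacity
underfit_model = keras.Sequential([
    keras.Input(shape=(20,)),            # 20 input features (assumed for illustration)
    layers.Dense(2, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Higher capacity: more units and an extra layer give the model room to learn the pattern
larger_model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

More capacity is not automatically better, however; as the next paragraph describes, too much capacity invites the opposite problem.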
Overfitting occurs when the model learns the training data too well, including its noise and random fluctuations. While it might achieve excellent performance on the training set, its performance degrades significantly on unseen data. This happens when the model becomes too complex relative to the amount and quality of training data.
Signs of overfitting include training loss that keeps decreasing while validation loss stalls or starts to rise, a widening gap between training and validation metrics, and excellent training performance that fails to carry over to new data.
Let's visualize how these phenomena typically appear in training and validation loss curves over epochs:
Comparing training loss (blue lines) and validation loss (orange lines) over epochs. Underfitting shows high loss for both. Overfitting shows training loss decreasing while validation loss starts increasing. A good fit shows both converging to a low value.
Underfitting and overfitting relate directly to the concepts of bias and variance in machine learning. Bias is error from overly rigid assumptions: a high-bias model is too simple to capture the underlying relationships and tends to underfit. Variance is error from excessive sensitivity to the particular training sample: a high-variance model latches onto noise and tends to overfit.
Ideally, we seek a model with low bias and low variance. However, there's often a trade-off: decreasing bias (making the model more complex) can increase variance, and decreasing variance (making the model simpler or adding constraints) can increase bias. Deep learning models, with their high capacity, are particularly prone to high variance (overfitting) if not managed correctly.
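For squared-error loss, this trade-off has a standard formulation known as the bias-variance decomposition. Writing \(\hat{f}(x)\) for the trained model's prediction at an input \(x\) and \(\sigma^2\) for the irreducible noise in the data, the expected prediction error decomposes as

\[
\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2 .
\]

Underfit models are dominated by the bias term, overfit models by the variance term; the techniques introduced later in this chapter aim to reduce variance without letting bias grow too much.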
The most direct way to monitor for overfitting and underfitting during training in Keras is by using a validation dataset. When you call the model.fit() method, you can provide validation data through the validation_data or validation_split argument. Keras will then evaluate the loss and any specified metrics on this validation set at the end of each epoch.
# Assuming the model is already compiled and x_train, y_train, x_val, y_val exist
history = model.fit(x_train, y_train,
                    epochs=50,
                    batch_size=128,
                    validation_data=(x_val, y_val))  # key step: pass validation data

# After training, history.history maps each loss/metric name to its per-epoch values
train_loss = history.history['loss']
val_loss = history.history['val_loss']
train_acc = history.history['accuracy']      # present if 'accuracy' was compiled as a metric
val_acc = history.history['val_accuracy']

# Plot these values against epochs to diagnose underfitting or overfitting
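A minimal sketch of that diagnostic plot, assuming matplotlib is installed and that history comes from the fit call above:

import matplotlib.pyplot as plt

epochs = range(1, len(train_loss) + 1)

# Plot training and validation loss on the same axes
plt.plot(epochs, train_loss, label='Training loss')
plt.plot(epochs, val_loss, label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training vs. validation loss')
plt.legend()
plt.show()

# Both curves staying high suggests underfitting; training loss falling while
# validation loss turns upward is the classic overfitting signature.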
Plotting these history metrics, as in the chart above, provides the essential diagnostic tool for understanding how your model is behaving. Recognizing the patterns of underfitting and overfitting in these plots is a significant skill in deep learning development.
Understanding these concepts is the first step. The following sections in this chapter will introduce concrete techniques like regularization, data augmentation, and callbacks, which are designed specifically to combat overfitting and help you achieve better model generalization.