When you train a neural network, the ultimate goal isn't just to achieve high accuracy on the data it was trained on. The true test of a model is its ability to generalize, meaning how well it performs on new, unseen data. Two common challenges that hinder generalization are underfitting and overfitting. Finding the right balance between these two is fundamental to building effective deep learning models.
Imagine you're trying to fit a curve to a set of data points. A curve that is too simple, such as a straight line through clearly curved data, misses the underlying pattern, while an overly flexible curve bends to pass through every single point, noise included. Neural networks face the same dilemma.
An underfit model typically performs poorly on both the training data and the validation (or test) data. This usually means the model is too simple to learn the meaningful relationships within the data, either because it lacks capacity (e.g., not enough layers or neurons) or because it was trained for too few epochs.
Signs of underfitting include high error on the training set, similarly high error on the validation set, and metrics that plateau well below what the task should allow. Visually, the training and validation loss curves typically both remain high and show little improvement over epochs.
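To make the idea of capacity concrete, here is a minimal sketch contrasting a model that is likely to underfit with a higher-capacity alternative. The layer sizes and the 20-feature input shape are illustrative assumptions, not tied to any particular dataset:

from tensorflow import keras
from tensorflow.keras import layers

# Likely to underfit: a single tiny hidden layer has very little capacity
underfit_model = keras.Sequential([
    keras.Input(shape=(20,)),            # 20 input features (assumed for illustration)
    layers.Dense(2, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Higher capacity: more units and an extra layer give the model room to learn the pattern
larger_model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

More capacity is not automatically better, however; as the next paragraph describes, too much capacity invites the opposite problem.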
Overfitting occurs when the model learns the training data too well, including its noise and random fluctuations. While it might achieve excellent performance on the training set, its performance degrades significantly on unseen data. This happens when the model becomes too complex relative to the amount and quality of training data.
Signs of overfitting include training loss that keeps decreasing while validation loss stalls or starts to rise, a widening gap between training and validation metrics, and excellent training performance that fails to carry over to new data.
Let's visualize how these phenomena typically appear in training and validation loss curves over epochs:
Comparing training loss (blue lines) and validation loss (orange lines) over epochs. Underfitting shows high loss for both. Overfitting shows training loss decreasing while validation loss starts increasing. A good fit shows both converging to a low value.
Underfitting and overfitting relate directly to the concepts of bias and variance in machine learning. Bias is error from overly rigid assumptions: a high-bias model is too simple to capture the underlying relationships and tends to underfit. Variance is error from excessive sensitivity to the particular training sample: a high-variance model latches onto noise and tends to overfit.
Ideally, we seek a model with low bias and low variance. However, there's often a trade-off: decreasing bias (making the model more complex) can increase variance, and decreasing variance (making the model simpler or adding constraints) can increase bias. Deep learning models, with their high capacity, are particularly prone to high variance (overfitting) if not managed correctly.
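For squared-error loss, this trade-off has a standard formulation known as the bias-variance decomposition. Writing \(\hat{f}(x)\) for the trained model's prediction at an input \(x\) and \(\sigma^2\) for the irreducible noise in the data, the expected prediction error decomposes as

\[
\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2 .
\]

Underfit models are dominated by the bias term, overfit models by the variance term; the techniques introduced later in this chapter aim to reduce variance without letting bias grow too much.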
The most direct way to monitor for overfitting and underfitting during training in Keras is by using a validation dataset. When you call the model.fit() method, you can provide validation data through the validation_data or validation_split argument. Keras will then evaluate the loss and any specified metrics on this validation set at the end of each epoch.
# Assuming the model is already compiled and x_train, y_train, x_val, y_val exist
history = model.fit(x_train, y_train,
                    epochs=50,
                    batch_size=128,
                    validation_data=(x_val, y_val))  # key step: pass validation data

# After training, history.history maps each loss/metric name to its per-epoch values
train_loss = history.history['loss']
val_loss = history.history['val_loss']
train_acc = history.history['accuracy']      # present if 'accuracy' was compiled as a metric
val_acc = history.history['val_accuracy']

# Plot these values against epochs to diagnose underfitting or overfitting
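A minimal sketch of that diagnostic plot, assuming matplotlib is installed and that history comes from the fit call above:

import matplotlib.pyplot as plt

epochs = range(1, len(train_loss) + 1)

# Plot training and validation loss on the same axes
plt.plot(epochs, train_loss, label='Training loss')
plt.plot(epochs, val_loss, label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training vs. validation loss')
plt.legend()
plt.show()

# Both curves staying high suggests underfitting; training loss falling while
# validation loss turns upward is the classic overfitting signature.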
Plotting these history metrics, as in the chart above, provides the essential diagnostic tool for understanding how your model is behaving. Recognizing the patterns of underfitting and overfitting in these plots is a significant skill in deep learning development.
Understanding these concepts is the first step. The following sections in this chapter will introduce concrete techniques like regularization, data augmentation, and callbacks, which are designed specifically to combat overfitting and help you achieve better model generalization.