When you train a neural network using the fit() method, the optimizer works diligently to minimize the loss function calculated on the training data. Metrics reported on this data, like training accuracy, tell you how well the model is learning the specific examples it's seeing. However, our ultimate goal isn't just to perform well on data the model has already seen; it's to build a model that generalizes well to new, unseen data. How can we estimate this generalization capability during the training process? This is where validation data comes in.
Simply monitoring the training loss and metrics can be misleading. A model might become extremely good at predicting the training examples, essentially "memorizing" them, including their noise and idiosyncrasies. This phenomenon, known as overfitting, leads to excellent performance on the training set but poor performance when faced with new data. The model fails to capture the underlying patterns required for generalization.
To get a more realistic assessment of how our model might perform on unseen data, we need to evaluate it periodically on a separate dataset that it doesn't train on. This is the validation set. There are two primary ways to incorporate validation during training in Keras:
1. Using validation_split: You can reserve a fraction of your training data specifically for validation. Keras automatically splits off the specified fraction (taken from the end of the arrays you pass in, before any shuffling) and uses it exclusively for evaluation at the end of each epoch. This is convenient if you don't have a pre-defined validation set.
# Assuming x_train and y_train hold your full training data
history = model.fit(x_train, y_train,
                    epochs=30,
                    batch_size=64,
                    validation_split=0.2)  # Use the last 20% of the data for validation
2. Using validation_data: You can provide a tuple (x_val, y_val) containing the features and labels of an entirely separate validation dataset. This is often preferred because it gives you more control over ensuring the validation set is representative and kept distinct from the training data.
# Assuming x_train_p, y_train_p are the training portion
# and x_val, y_val are the separate validation set
history = model.fit(x_train_p, y_train_p,
                    epochs=30,
                    batch_size=64,
                    validation_data=(x_val, y_val))  # Use the provided validation set
It's important that the validation data comes from the same underlying distribution as the training data but contains examples the model has not seen during training updates. The model's weights are never updated based on the validation set's performance; it's purely for monitoring.
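If you need to build such a validation set yourself, one common approach is to shuffle the full training data and slice off a portion. The sketch below assumes x_train and y_train are NumPy arrays holding the full dataset and produces the x_train_p, y_train_p, x_val, and y_val arrays used in the example above.
import numpy as np

# Minimal sketch: shuffle the full training data, then hold out 20%
# as a separate validation set. Assumes x_train and y_train are NumPy arrays.
indices = np.random.permutation(len(x_train))
x_shuffled, y_shuffled = x_train[indices], y_train[indices]

num_val = int(0.2 * len(x_train))
x_val, y_val = x_shuffled[:num_val], y_shuffled[:num_val]
x_train_p, y_train_p = x_shuffled[num_val:], y_shuffled[num_val:]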
When you use either validation_split or validation_data with the fit() method, Keras evaluates the model on the validation set at the end of each epoch. The results are added to the History object returned by fit(), with the validation metrics prefixed by val_. For example, if you compiled your model with loss='binary_crossentropy' and metrics=['accuracy'], the history.history dictionary will contain keys like:
loss: Training loss for each epoch.
accuracy: Training accuracy for each epoch.
val_loss: Validation loss for each epoch.
val_accuracy: Validation accuracy for each epoch.
Plotting these metrics over epochs is standard practice for understanding the training dynamics.
Analyzing the plots of training versus validation loss and metrics is fundamental for diagnosing potential issues like overfitting or underfitting.
Ideal Scenario: Both training and validation loss decrease steadily and converge to similar low values. Likewise, training and validation accuracy increase and plateau at similar high values. This suggests the model is learning generalizable patterns.
Overfitting: The training loss continues to decrease (or training accuracy increases), but the validation loss starts to increase (or validation accuracy plateaus or decreases) after some number of epochs. This divergence indicates the model is starting to memorize the training data and is losing its ability to generalize. The gap between the training and validation curves widens significantly.
Training loss steadily decreases while validation loss starts increasing after epoch 20, indicating overfitting.
Underfitting: Both training and validation loss are high and decrease slowly, and neither reaches a satisfactory value. This indicates the model is not capturing the underlying patterns in the data.
Monitoring validation performance is therefore essential. It provides a proxy for how the model will perform on real-world, unseen data and helps identify the point during training where the model achieves the best generalization. Recognizing overfitting early allows you to stop training before the model's performance on new data degrades. Techniques like early stopping and regularization, which we will discuss in Chapter 6, directly use validation monitoring to improve model generalization. For now, understand that watching the val_ metrics during fit() is your primary tool for assessing if your training is heading in the right direction.
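As a brief preview of Chapter 6, the sketch below (assuming the same model and data as the earlier examples) shows how Keras' built-in EarlyStopping callback can watch val_loss for you and halt training when it stops improving:
from tensorflow import keras

# Preview sketch: stop training if val_loss has not improved for 3 epochs
# and restore the weights from the best epoch seen so far.
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss',
                                           patience=3,
                                           restore_best_weights=True)

history = model.fit(x_train_p, y_train_p,
                    epochs=30,
                    batch_size=64,
                    validation_data=(x_val, y_val),
                    callbacks=[early_stop])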