
Making Predictions with model.predict()

Once you have trained your model using model.fit() and assessed its generalization performance on a separate test set using model.evaluate(), the next logical step is to use the model for its intended purpose: making predictions on new, unseen data points. This is where the model.predict() method comes into play. It takes input data and returns the model's output predictions.

Using `model.predict()`

The predict() method is straightforward to use. You pass it the input data for which you want predictions, and it returns the model's outputs. The input data should generally be in the form of a NumPy array or a tf.data.Dataset object, similar to what you would use for fit() or evaluate().

```
# Assume 'model' is a trained Keras model
# Assume 'new_data' is a NumPy array or tf.data.Dataset with new samples

predictions = model.predict(new_data)

print(predictions)
```
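For readers who want something they can run end to end, here is a minimal sketch. The tiny untrained network and the random `new_data` array are hypothetical stand-ins for a trained model and real inputs; only the call pattern and the output shape are the point.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a trained model: 4 input features, 1 output value.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Hypothetical new data: 5 samples with 4 features each.
new_data = np.random.rand(5, 4).astype("float32")

# verbose=0 suppresses the progress bar that predict() prints by default.
predictions = model.predict(new_data, verbose=0)

print(type(predictions).__name__)  # ndarray
print(predictions.shape)           # (5, 1)
```

The result is a plain NumPy array with one row per input sample, so it can be passed straight to NumPy or pandas code.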

Input Data Shape

A common source of errors is passing input data with the wrong shape. model.predict() expects a batch of samples, even if you are predicting for a single instance. The input shape should match the input_shape the model was built with, plus a batch dimension at the beginning.

For example, if your model expects input images of shape (28, 28, 1) (like MNIST digits), and you want to predict on a single image stored in a NumPy array single_image with shape (28, 28, 1), you need to add a batch dimension:

```
import numpy as np
import tensorflow as tf

# Assume 'model' expects input shape (None, 28, 28, 1)
# Assume 'single_image' is a NumPy array with shape (28, 28, 1)

# Add a batch dimension using np.expand_dims or np.newaxis indexing
input_for_predict = np.expand_dims(single_image, axis=0)  # Shape becomes (1, 28, 28, 1)
# Alternatively: input_for_predict = single_image[np.newaxis, ...]

prediction_for_single = model.predict(input_for_predict)
print(f"Shape of input passed to predict: {input_for_predict.shape}")
print(f"Prediction output: {prediction_for_single}")
```
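One way to guard against shape mistakes is a small helper that adds the batch axis only when it is missing. The function name `ensure_batch_dim` and the zero-filled placeholder images are hypothetical, and the sketch uses NumPy only, so it runs without a model.

```python
import numpy as np

def ensure_batch_dim(x, sample_shape):
    """Return x with a leading batch axis, given the per-sample shape."""
    x = np.asarray(x)
    sample_shape = tuple(sample_shape)
    if x.shape == sample_shape:
        return x[np.newaxis, ...]   # single sample -> batch of 1
    if x.shape[1:] == sample_shape:
        return x                    # already batched, leave untouched
    raise ValueError(
        f"Expected shape {sample_shape} or (N, *{sample_shape}), got {x.shape}"
    )

single_image = np.zeros((28, 28, 1), dtype="float32")        # placeholder image
multiple_images = np.zeros((10, 28, 28, 1), dtype="float32")  # placeholder batch

print(ensure_batch_dim(single_image, (28, 28, 1)).shape)     # (1, 28, 28, 1)
print(ensure_batch_dim(multiple_images, (28, 28, 1)).shape)  # (10, 28, 28, 1)
```

With a batch of one, the prediction for that sample is the first row of the output, for example `prediction_for_single[0]`.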

If you pass multiple samples, they should be stacked along the first dimension:

```
# Assume 'multiple_images' is a NumPy array with shape (10, 28, 28, 1)
# (10 samples, each 28x28 grayscale)

predictions_for_multiple = model.predict(multiple_images)
print(f"Shape of predictions for multiple images: {predictions_for_multiple.shape}")
# Output shape will depend on the model's final layer, e.g., (10, num_classes) for classification
```

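You can also pass a tf.data.Dataset directly to predict(). This is often efficient for large datasets because it reuses the pipeline's batching and prefetching. The dataset should already be batched; do not pass batch_size in this case, since the dataset itself determines the batches.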

```
# Assume 'new_dataset' is a tf.data.Dataset yielding batches of new data
predictions_from_dataset = model.predict(new_dataset)
# The predictions will be concatenated into a single NumPy array
```
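As a rough illustration of the dataset path, the sketch below builds a small batched pipeline from random features and feeds it to an untrained three-class model. Both are hypothetical placeholders, chosen only to show that the per-batch outputs come back as one array.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a trained 3-class classifier.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Hypothetical features: 10 samples with 4 features each.
features = np.random.rand(10, 4).astype("float32")

new_dataset = (
    tf.data.Dataset.from_tensor_slices(features)
    .batch(4)                     # yields batches of 4, 4, and 2 samples
    .prefetch(tf.data.AUTOTUNE)   # overlap data preparation with inference
)

# No batch_size argument here: the dataset already defines the batches.
predictions_from_dataset = model.predict(new_dataset, verbose=0)
print(predictions_from_dataset.shape)  # (10, 3)
```

Even though the last batch is smaller than the others, the output still has exactly one row per sample.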

Interpreting the Output

The structure and meaning of the array returned by model.predict() depend entirely on the architecture of your model, specifically its final layer and activation function.

Regression: If your model performs regression (e.g., predicting house prices), the output will typically be a NumPy array where each element (or row, if predicting multiple values per sample) corresponds to the predicted continuous value for the respective input sample. If the final layer has one unit and no activation (or linear activation), the output shape for $N$ input samples would be $(N, 1)$.
```
# Example: Predicting a single value per input
# Input shape: (N, num_features)
# Output shape: (N, 1)
print(predictions[0]) # Output: [predicted_value_for_sample_0]
```
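The trailing axis in an $(N, 1)$ output matters when predictions are compared with a target array of shape $(N,)$. The following sketch uses made-up numbers in place of real model output to show the broadcasting pitfall and one way to avoid it.

```python
import numpy as np

# Made-up predictions shaped like the output of a one-unit regression head.
predictions = np.array([[210.5], [185.0], [332.25]], dtype="float32")  # (3, 1)
y_true = np.array([200.0, 190.0, 330.0], dtype="float32")              # (3,)

# Pitfall: (3, 1) - (3,) broadcasts to a (3, 3) matrix, not 3 errors.
wrong = predictions - y_true
print(wrong.shape)   # (3, 3)

# Drop the trailing axis first so both arrays have shape (3,).
flat_predictions = predictions.squeeze(axis=-1)
errors = flat_predictions - y_true
print(errors.shape)  # (3,)
```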
Binary Classification: For binary classification, the final layer usually has one unit with a sigmoid activation function. model.predict() will return, for each sample, the probability (a value between 0 and 1) that it belongs to the positive class. You typically apply a threshold (often 0.5) to convert these probabilities into class labels (0 or 1).
$\text{Predicted Class} = \begin{cases} 1 & \text{if } \text{prediction} > 0.5 \\ 0 & \text{otherwise} \end{cases}$
The output shape for $N$ input samples would be $(N, 1)$.
```
probabilities = model.predict(new_data) # Shape (N, 1)
predicted_classes = (probabilities > 0.5).astype("int32")
print(f"Probabilities: {probabilities[0]}") # Output: e.g., [0.92]
print(f"Predicted class: {predicted_classes[0]}") # Output: [1]
```
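The threshold does not have to be 0.5. The sketch below uses made-up probabilities and a hypothetical helper, `to_labels`, to show how the labels change with the threshold and how to flatten the $(N, 1)$ result into a 1-D label array.

```python
import numpy as np

# Made-up sigmoid outputs, shaped like model.predict() results: (N, 1).
probabilities = np.array([[0.92], [0.35], [0.50], [0.71]])

def to_labels(probs, threshold=0.5):
    """Convert (N, 1) probabilities to a flat (N,) array of 0/1 labels."""
    return (probs > threshold).astype("int32").ravel()

default_labels = to_labels(probabilities)
strict_labels = to_labels(probabilities, threshold=0.8)

print(default_labels)  # [1 0 0 1]
print(strict_labels)   # [1 0 0 0]
```

Note that a probability of exactly 0.5 maps to class 0 here, because the comparison is strict, matching the rule stated above.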
Multi-class Classification: For multi-class classification with $C$ classes, the final layer typically has $C$ units with a softmax activation function. model.predict() returns an array where each row contains the probability distribution over the $C$ classes for the corresponding input sample. The probabilities in each row sum to 1, up to floating-point rounding. To get the predicted class label, you usually find the index of the highest probability using np.argmax.
$\text{Predicted Class Index} = \underset{i}{\operatorname{argmax}}(\text{prediction}_i)$
The output shape for $N$ input samples would be $(N, C)$.
```
probabilities_multi = model.predict(new_data) # Shape (N, C)
predicted_class_indices = np.argmax(probabilities_multi, axis=1)
print(f"Probabilities for first sample: {probabilities_multi[0]}") # Output: e.g., [0.1, 0.7, 0.2]
print(f"Predicted class index for first sample: {predicted_class_indices[0]}") # Output: 1
```
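Class indices are usually mapped back to human-readable labels. The sketch below assumes a hypothetical `class_names` array whose order matches the label encoding used in training, and uses made-up probabilities in place of real model output.

```python
import numpy as np

# Made-up softmax outputs for 2 samples and 3 classes: shape (2, 3).
probabilities_multi = np.array([
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
])

# Hypothetical label names; the order must match the training label encoding.
class_names = np.array(["cat", "dog", "bird"])

predicted_class_indices = np.argmax(probabilities_multi, axis=1)
predicted_labels = class_names[predicted_class_indices]
confidence = probabilities_multi.max(axis=1)  # probability of the chosen class

for label, conf in zip(predicted_labels, confidence):
    print(f"{label}: {conf:.2f}")
# dog: 0.70
# cat: 0.60
```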

`predict()` vs. Direct Call `model()`

You might notice that you can also get predictions by calling the model instance directly as a function: predictions = model(new_data, training=False). While this works, model.predict() is generally preferred for inference on larger datasets.

model.predict(): Designed for batch inference. It accepts NumPy arrays and tf.data.Dataset objects, processes the input in batches (for NumPy arrays, the batch_size argument sets the batch size and defaults to 32), returns NumPy arrays, and always runs in inference mode (e.g., dropout layers are inactive, batch normalization uses its learned moving statistics).
model(data, training=False): The more flexible, TensorFlow-native way. It returns tensors instead of NumPy arrays and processes the whole input in a single forward pass. It's useful within custom training loops, when you need direct tensor outputs, or for small inputs that fit in one batch, where it avoids the per-call overhead of predict(). Explicitly setting training=False is important to ensure layers like Dropout and BatchNormalization behave correctly during inference.

For standard prediction tasks on potentially large inputs, model.predict() is usually the more convenient and potentially more performant option.
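The two call styles can be compared side by side. The sketch below uses a small untrained model with a Dropout layer and random inputs, both hypothetical, to show the different return types and that the values agree when the direct call runs with training=False.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a trained model; Dropout makes the mode matter.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(2),
])

x = np.random.rand(6, 4).astype("float32")

# predict(): splits x into batches of 2 and returns a NumPy array.
from_predict = model.predict(x, batch_size=2, verbose=0)

# Direct call: one forward pass over all of x, returns a tensor.
from_call = model(x, training=False)

print(type(from_predict).__name__)  # ndarray
print(from_call.shape)              # (6, 2)

# Both paths run in inference mode, so the values match up to rounding.
same = np.allclose(from_predict, np.asarray(from_call), atol=1e-5)
print(same)
```

Calling the model with training=True instead would apply dropout and generally produce different values on each call.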

Making predictions is the ultimate goal of building a supervised learning model. model.predict() provides a simple and efficient interface in Keras to apply your trained model to new data, allowing you to leverage the patterns it learned during training. Remember to always preprocess your new data in exactly the same way as your training data before feeding it to model.predict().
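As one illustration of consistent preprocessing, the sketch below standardizes new data with statistics computed on the training set rather than on the new data itself. The random arrays and the `preprocess` helper are hypothetical, and the final predict() call is left as a comment because no model is defined here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 100 training samples and 5 new samples, 3 features.
x_train = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
x_new = rng.normal(loc=50.0, scale=10.0, size=(5, 3))

# Compute the statistics once, on the training data only, and keep them.
train_mean = x_train.mean(axis=0)
train_std = x_train.std(axis=0)

def preprocess(x):
    """Standardize x with the stored training statistics."""
    return (x - train_mean) / train_std

x_train_scaled = preprocess(x_train)  # what the model would be trained on
x_new_scaled = preprocess(x_new)      # same transform at prediction time

print(x_new_scaled.shape)  # (5, 3)
# predictions = model.predict(x_new_scaled)
```

Recomputing the mean and standard deviation on the new data would shift the inputs relative to what the model saw during training.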
