Once your machine learning model has processed the input data and generated a prediction, the final step in the request-response cycle is to send these results back to the client. How you structure this response is significant for usability and integration with downstream applications. Simply returning a raw number or string often isn't sufficient. Clients usually benefit from a well-defined JSON object containing the prediction and potentially other relevant information, like confidence scores or probabilities.
FastAPI, in conjunction with Pydantic, makes defining and enforcing these response structures straightforward using response models.
Let's start with a basic scenario: a classification model that predicts a single category label. While you could return just the label as a string, it's better practice to wrap it in a JSON object. This provides context and makes the API response more self-descriptive.
We can define a Pydantic model to represent this structure:
```python
from pydantic import BaseModel, Field

class PredictionResponse(BaseModel):
    predicted_class: str = Field(..., description="The predicted class label.")
    # You might add other fields later, like a request ID or model version
```
In your endpoint, you then pass this model to the `response_model` parameter of the path operation decorator. FastAPI automatically serializes your return value (e.g., a dictionary or another Pydantic model instance) into JSON conforming to `PredictionResponse`.
```python
# Assume 'model' is your loaded ML model object
# Assume 'InputFeatures' is your Pydantic input model
@app.post("/predict", response_model=PredictionResponse)
async def make_prediction(features: InputFeatures):
    # 1. Preprocess features if necessary
    processed_data = preprocess(features.dict())  # Example preprocessing

    # 2. Get prediction from the model
    # Assume model.predict() returns the class label directly
    prediction_label = model.predict(processed_data)

    # 3. Return the result conforming to the response model
    return PredictionResponse(predicted_class=prediction_label)
```
This approach keeps the response format consistent and automatically includes it in the API documentation that FastAPI generates.
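As a concrete illustration, here is the JSON a client would receive from this endpoint, sketched outside FastAPI. The `resp` instance and the "cat" label are made up for the example:

```python
import json
from pydantic import BaseModel, Field

class PredictionResponse(BaseModel):
    predicted_class: str = Field(..., description="The predicted class label.")

# Serializing an instance shows the exact JSON shape a client receives
resp = PredictionResponse(predicted_class="cat")
payload = json.dumps(resp.dict())
print(payload)  # {"predicted_class": "cat"}
```

FastAPI performs this serialization for you; the point of the sketch is that the wire format is fully determined by the Pydantic model.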
For many classification tasks, knowing the model's confidence in its prediction is just as important as the prediction itself. Most classification models can output probabilities for each possible class. For instance, scikit-learn classifiers often have a `predict_proba()` method that returns an array of probabilities, one for each class.
Returning these probabilities provides valuable context to the client. They might use this information to set decision thresholds or identify uncertain predictions that require human review.
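For instance, a client might apply a confidence threshold to decide whether to act on a prediction automatically. A minimal sketch of such client-side logic, where the `route_prediction` helper, the payload values, and the 0.8 cutoff are all hypothetical:

```python
CONFIDENCE_THRESHOLD = 0.8  # assumed application-specific cutoff

def route_prediction(response: dict) -> str:
    # Look up the probability of the winning class and route accordingly
    top_probability = response["probabilities"][response["predicted_class"]]
    if top_probability >= CONFIDENCE_THRESHOLD:
        return "auto_accept"
    return "human_review"

payload = {
    "predicted_class": "cat",
    "probabilities": {"cat": 0.95, "dog": 0.04, "other": 0.01},
}
print(route_prediction(payload))  # auto_accept
```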
To include probabilities, we extend our Pydantic response model:
```python
from pydantic import BaseModel, Field
from typing import Dict

class ProbabilityResponse(BaseModel):
    predicted_class: str = Field(..., description="The predicted class label.")
    probabilities: Dict[str, float] = Field(
        ...,
        description="A dictionary mapping class labels to their predicted probabilities.",
    )
    # Example: {"cat": 0.95, "dog": 0.04, "other": 0.01}
```
```python
@app.post("/predict_proba", response_model=ProbabilityResponse)
async def make_prediction_with_proba(features: InputFeatures):
    processed_data = preprocess(features.dict())

    # Assume model.predict() gives the label
    prediction_label = model.predict(processed_data)

    # Assume model.predict_proba() gives probabilities per class
    # and model.classes_ gives the order of classes
    proba_values = model.predict_proba(processed_data)[0]  # Probabilities for the single input sample
    class_labels = model.classes_

    probabilities_dict = {label: proba for label, proba in zip(class_labels, proba_values)}

    # Sort probabilities for better readability (optional)
    sorted_probabilities = dict(
        sorted(probabilities_dict.items(), key=lambda item: item[1], reverse=True)
    )

    return ProbabilityResponse(
        predicted_class=prediction_label,
        probabilities=sorted_probabilities,
    )
```
In this example, `ProbabilityResponse` defines a structure that includes both the most likely class (`predicted_class`) and a dictionary (`probabilities`) containing the probability associated with each possible class. The endpoint retrieves both the prediction and the probabilities from the model and packages them according to this structure.
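The dictionary-building and sorting steps can be illustrated in isolation. The labels and probability values below are stand-ins for what `model.classes_` and `model.predict_proba()` would return:

```python
class_labels = ["cat", "dog", "other"]   # stand-in for model.classes_
proba_values = [0.04, 0.95, 0.01]        # stand-in for model.predict_proba(X)[0]

# Pair each label with its probability, then sort descending by probability
probabilities = dict(zip(class_labels, proba_values))
sorted_probabilities = dict(
    sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
)
print(sorted_probabilities)  # {'dog': 0.95, 'cat': 0.04, 'other': 0.01}
```

Because Python dictionaries preserve insertion order, the client sees the most likely class first in the JSON object.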
A visual representation of these probabilities can sometimes be helpful for understanding the model's confidence distribution.

*Figure: Example distribution of predicted probabilities across different classes for a single input instance. Class A has the highest probability.*
The structure of your response model should naturally reflect the output of your specific machine learning model. For a regression model, for example, instead of `predicted_class` and `probabilities`, you might return a single `predicted_value` (float) and potentially a `confidence_interval` (e.g., a dictionary with `lower_bound` and `upper_bound`). The principle remains the same: define a Pydantic model that accurately describes the expected output format and use it in the `response_model` parameter of your endpoint decorator. This provides clear contracts for your API consumers and leverages FastAPI's validation and documentation features.
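As a sketch of the regression case, such a response model might look like the following. The `RegressionResponse` name, its fields, and the sample values are illustrative, not from a specific library:

```python
from typing import Dict, Optional
from pydantic import BaseModel, Field

class RegressionResponse(BaseModel):
    predicted_value: float = Field(..., description="The model's point prediction.")
    confidence_interval: Optional[Dict[str, float]] = Field(
        None,
        description="Optional bounds, e.g. {'lower_bound': ..., 'upper_bound': ...}",
    )

# Example instance with contrived values
resp = RegressionResponse(
    predicted_value=42.7,
    confidence_interval={"lower_bound": 40.1, "upper_bound": 45.3},
)
```

Making `confidence_interval` optional lets the same model serve estimators that do and do not provide uncertainty bounds.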
Remember to consider numerical precision when returning floating-point numbers like probabilities. You might want to round them to a reasonable number of decimal places within your endpoint logic before returning the response. Also, ensure your endpoint handles potential errors during prediction gracefully, perhaps returning a specific HTTP error code and message instead of the standard prediction response.
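Rounding can be done with a simple dictionary comprehension before constructing the response; the raw values below are contrived to show the float noise being removed:

```python
# Raw probabilities often carry float noise that is meaningless to clients
raw = {"cat": 0.9500000000000001, "dog": 0.03999999999999998, "other": 0.01}

# Round to a reasonable precision before building the response model
rounded = {label: round(p, 4) for label, p in raw.items()}
print(rounded)  # {'cat': 0.95, 'dog': 0.04, 'other': 0.01}
```

For the error-handling side, raising an HTTP exception with an appropriate status code (e.g., 500 or 422) when `model.predict()` fails keeps failures explicit instead of returning a malformed prediction payload.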
© 2025 ApX Machine Learning