Alright, let's translate theory into practice. You've trained a model, evaluated it, and saved it. Now, it's time to make it accessible so applications can request predictions from it. This hands-on exercise walks through creating a simple web API for your model using Flask and then packaging it into a Docker container for easy deployment and scaling.
We'll assume you have a trained scikit-learn model saved to a file (e.g., `model.joblib`). For this example, let's imagine it's a simple classifier trained on data with four features, similar to the Iris dataset. You'll also need Python, pip, and Docker installed on your system.
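If you don't yet have a saved model on hand, a minimal sketch like the following produces a compatible `model.joblib`. The filename `train_model.py` and the choice of estimator are placeholders; any scikit-learn estimator saved with joblib will work with the API below.

```python
# train_model.py -- a stand-in training script, assuming you need a model to deploy.
import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # 150 samples, four numeric features each
model = LogisticRegression(max_iter=200).fit(X, y)
joblib.dump(model, 'model.joblib')  # creates the file this exercise expects
```

Run it wherever you normally train models, then copy the resulting file into the project directory created next.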
First, create a directory for your project. Inside this directory, we'll place our application code, the model file, dependency information, and the Docker configuration.
```bash
mkdir model_api_project
cd model_api_project
# Assume you copy your trained model here
# cp path/to/your/model.joblib ./model.joblib
touch app.py requirements.txt Dockerfile
```
Your directory should look like this:
```text
model_api_project/
├── app.py              # Flask application code
├── model.joblib        # Your saved machine learning model
├── requirements.txt    # Python dependencies
└── Dockerfile          # Instructions for building the Docker image
```
We'll use Flask, a lightweight Python web framework, to create an API endpoint that accepts data, uses our loaded model to make a prediction, and returns the result.
Edit `app.py` and add the following code:
```python
import joblib
import numpy as np
from flask import Flask, request, jsonify

# 1. Load the trained model.
# Make sure 'model.joblib' is in the same directory as app.py,
# or provide the full path to the model file.
try:
    model = joblib.load('model.joblib')
    print("Model loaded successfully.")
except FileNotFoundError:
    print("Error: model.joblib not found. Make sure the model file is in the correct directory.")
    model = None  # Set model to None if loading fails
except Exception as e:
    print(f"Error loading model: {e}")
    model = None  # Set model to None if loading fails

# 2. Create the Flask application instance
app = Flask(__name__)

# 3. Define the prediction endpoint
@app.route('/predict', methods=['POST'])
def predict():
    if model is None:
        return jsonify({'error': 'Model not loaded, cannot make predictions.'}), 500

    try:
        # Get data from the POST request body.
        # silent=True returns None instead of raising if the body isn't valid JSON.
        data = request.get_json(silent=True)

        # Validate input data structure (basic example)
        if data is None or 'features' not in data or not isinstance(data['features'], list):
            return jsonify({'error': 'Missing or invalid "features" field in JSON payload. Expected a list.'}), 400

        # Assuming the model expects a 2D array-like structure for prediction
        # (e.g., for scikit-learn models).
        # Perform basic validation on feature count if possible.
        # Example: if your model expects 4 features
        # if len(data['features']) != 4:
        #     return jsonify({'error': f'Expected 4 features, got {len(data["features"])}'}), 400

        features = np.array(data['features']).reshape(1, -1)  # Reshape for a single prediction

        # Make prediction
        prediction = model.predict(features)

        # Prepare the response.
        # Convert numpy types to standard Python types for JSON serialization.
        prediction_result = prediction[0].item() if isinstance(prediction[0], np.generic) else prediction[0]

        return jsonify({'prediction': prediction_result})

    except ValueError as ve:
        # Handle potential errors during array conversion or prediction
        return jsonify({'error': f'Error processing features: {ve}'}), 400
    except Exception as e:
        # Generic error handler
        app.logger.error(f"Prediction error: {e}")  # Log the error for debugging
        return jsonify({'error': 'An internal error occurred during prediction.'}), 500

# 4. Run the Flask app.
# This block allows running the app directly with `python app.py`.
# Debug mode should be False in a production environment.
if __name__ == '__main__':
    # host='0.0.0.0' makes the server listen on all network interfaces,
    # which is required for it to be reachable from outside the container.
    app.run(host='0.0.0.0', port=5000, debug=False)
```
Explanation:

- We use `joblib.load()` to load our pre-trained model file. Basic error handling is included.
- We define an endpoint at `/predict` that only accepts POST requests (as we're sending data to it).
- In the `predict` function, we get the JSON data sent in the request body using `request.get_json()`.
- We expect a key `features` containing a list of numerical feature values (e.g., `{"features": [5.1, 3.5, 1.4, 0.2]}`). Basic input validation is added (a stricter sketch follows this list).
- `model.predict()` is called with the features.
- The prediction is converted to a standard Python type and returned as a JSON response using `jsonify`.
- The `if __name__ == '__main__':` block allows you to run the server locally for testing using `python app.py`. We set `host='0.0.0.0'` to make it accessible from outside the container later, and `port=5000` is a common choice for development servers. `debug=False` is important for anything beyond local testing.
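The inline validation in `app.py` is deliberately minimal. If you want stricter checks, a hypothetical helper along these lines could replace them; the feature count of 4 is an assumption matching the Iris-like model described above.

```python
# Hypothetical stricter validation helper for app.py.
# EXPECTED_FEATURES is an assumption; set it to your model's feature count.
EXPECTED_FEATURES = 4

def validate_features(payload):
    """Return (features, error); error is None when the payload is valid."""
    if not isinstance(payload, dict):
        return None, 'Request body must be a JSON object.'
    features = payload.get('features')
    if not isinstance(features, list):
        return None, 'Expected "features" to be a list.'
    if len(features) != EXPECTED_FEATURES:
        return None, f'Expected {EXPECTED_FEATURES} features, got {len(features)}.'
    if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in features):
        return None, 'All features must be numeric.'
    return features, None
```

In the endpoint you would call `features, error = validate_features(data)` and return a 400 response whenever `error` is not None.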
List the Python packages your application needs in `requirements.txt`.
```text
# requirements.txt
Flask>=2.0.0,<3.0.0
scikit-learn>=1.0.0,<1.4.0  # Or the version compatible with your model
joblib>=1.1.0
numpy>=1.21.0
```
Note: Adjust version numbers to match your environment. Pinning compatible versions is generally recommended for stability and reproducibility; in particular, the scikit-learn version should match the one the model was trained with.
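If you already have a working local environment, one common way to capture exact versions is to export them (you would typically trim the output down to just the packages the API needs):

```bash
pip freeze > requirements.txt
```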
The `Dockerfile` provides instructions to Docker for building an image containing your application, its dependencies, and the necessary runtime environment.
Edit `Dockerfile` and add the following content:
```dockerfile
# Dockerfile

# 1. Use an official Python runtime as a parent image.
# Using a 'slim' variant reduces the image size.
FROM python:3.9-slim

# 2. Set the working directory inside the container
WORKDIR /app

# 3. Copy the requirements file into the container at /app
COPY requirements.txt .

# 4. Install the packages specified in requirements.txt
# --no-cache-dir keeps the image layer smaller
RUN pip install --no-cache-dir -r requirements.txt

# 5. Copy the rest of the application code (app.py, model.joblib) into the container
COPY . .

# 6. Expose the port the app runs on
EXPOSE 5000

# 7. Define the command to run the application when the container starts
CMD ["python", "app.py"]
```
Explanation:

- `FROM python:3.9-slim`: Specifies the base image. We use a slim Python 3.9 image.
- `WORKDIR /app`: Sets the default directory for subsequent commands inside the container.
- `COPY requirements.txt .`: Copies the `requirements.txt` file from your host machine into the container's `/app` directory.
- `RUN pip install ...`: Executes the command to install the dependencies listed in `requirements.txt`.
- `COPY . .`: Copies all remaining files from the current directory on your host (including `app.py` and `model.joblib`) into the container's `/app` directory (see the `.dockerignore` sketch after this list).
- `EXPOSE 5000`: Informs Docker that the container will listen on port 5000 at runtime. This doesn't actually publish the port; it serves as documentation.
- `CMD ["python", "app.py"]`: Specifies the default command to execute when a container based on this image is started. This runs our Flask application.
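Because `COPY . .` copies everything in the build context into the image, it's worth adding a `.dockerignore` file next to the `Dockerfile` so that caches, virtual environments, and version-control metadata stay out of the image. The entries below are illustrative; adjust them to your project:

```text
# .dockerignore (illustrative entries)
__pycache__/
*.pyc
.venv/
.git/
```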
Now, open your terminal or command prompt, navigate to the `model_api_project` directory, and run the Docker build command:
```bash
docker build -t model-api:latest .
```
- `docker build`: The command to build an image.
- `-t model-api:latest`: Tags the image with a name (`model-api`) and a tag (`latest`). This makes it easier to reference later.
- `.`: Specifies the build context (the current directory), which contains the `Dockerfile` and all necessary files.

Docker will execute the steps in your `Dockerfile`, downloading the base image, installing dependencies, and copying your files. This might take a few minutes the first time.
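Optionally, you can confirm the image was created by listing images for that repository name:

```bash
docker images model-api
```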
Once the image is built successfully, you can run a container from it:
```bash
docker run -p 5001:5000 model-api:latest
```
- `docker run`: The command to create and start a container from an image.
- `-p 5001:5000`: Publishes the container's port 5000 to port 5001 on your host machine. This means requests sent to `localhost:5001` on your computer will be forwarded to port 5000 inside the container, where the Flask app is listening. You can change `5001` to another available port if needed.
- `model-api:latest`: Specifies the image to use for the container.

You should see output indicating that the Flask server is running, similar to when you run `python app.py` locally, including the "Model loaded successfully" message if `model.joblib` was found.
With the container running, open another terminal or use a tool like Postman or Insomnia to send a POST request to the API. Using `curl`:
```bash
curl -X POST http://localhost:5001/predict \
     -H "Content-Type: application/json" \
     -d '{"features": [5.1, 3.5, 1.4, 0.2]}'
```
Explanation:

- `curl`: A command-line tool for transferring data with URLs.
- `-X POST`: Specifies the HTTP method as POST.
- `http://localhost:5001/predict`: The URL of your running API endpoint (using the host port you published).
- `-H "Content-Type: application/json"`: Sets the content type header, indicating we are sending JSON data.
- `-d '{"features": [5.1, 3.5, 1.4, 0.2]}'`: The data payload sent in the request body. Make sure the number and order of features match what your `model.joblib` expects.

If successful, the API running inside the container will process the request and return a JSON response like:
{"prediction": 0}
(The actual prediction value 0
depends on your specific model.joblib
).
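If you prefer to test from Python instead of `curl`, a short client sketch looks like this; it assumes the third-party `requests` package is installed and that the container was started with `-p 5001:5000` as shown above:

```python
# client.py -- a sketch of a Python client for the containerized API.
import requests

response = requests.post(
    'http://localhost:5001/predict',
    json={'features': [5.1, 3.5, 1.4, 0.2]},  # match your model's feature count and order
    timeout=5,
)
response.raise_for_status()  # raise an error for 4xx/5xx responses
print(response.json())       # e.g., {'prediction': 0}
```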
You can stop the running container by going back to the terminal where `docker run` is executing and pressing `Ctrl+C`.
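If you'd rather not keep a terminal occupied, you can run the container in the background instead. The container name below is just a convenient handle of our choosing:

```bash
docker run -d --name model-api-container -p 5001:5000 model-api:latest
docker logs model-api-container   # view the Flask startup output
docker stop model-api-container   # stop the background container
```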
Congratulations! You have successfully:

- Wrapped a trained model in a simple Flask API with a `/predict` endpoint.
- Written a `Dockerfile` to specify the container build process.
- Built a Docker image containing the application, its dependencies, and the model.
- Run the container and tested the endpoint with a live prediction request.

This forms the foundation of model deployment. From here, you could explore adding more robust error handling, input validation, logging, deploying to cloud platforms, or using more advanced serving frameworks. This hands-on exercise provides a significant step in bridging the gap between developing a model and making it usable in real applications.
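As one example of a production-oriented refinement (a sketch, not part of this exercise): Flask's built-in development server is not intended for production traffic, so a common step is to serve the app with a WSGI server such as Gunicorn. Assuming you add `gunicorn` to `requirements.txt`, the Dockerfile's final line could become:

```dockerfile
# Sketch: replace the development server with Gunicorn (assumes gunicorn is in requirements.txt).
# "app:app" means: in module app.py, use the Flask instance named `app`.
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "2", "app:app"]
```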