Packaging a machine learning model into a self-contained, portable Docker container is a foundational skill for deploying models in any modern environment. This practical guide demonstrates how to wrap a model in a web API and containerize it with Docker.
Our goal is to take a pre-trained model, a web application that serves it, and all its dependencies, and bundle them into a single unit called a Docker image. From this image, we can launch a container that runs our model as an isolated service.
Before we begin, ensure you have Docker Desktop installed and running on your machine. You can download it from the official Docker website.
We will create a few files for this exercise. Let's set up a new directory for our project to keep things organized.
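Based on the files created throughout this guide, the finished directory will look like this (the ml-docker-project name is our placeholder; use any directory name you like):

ml-docker-project/
├── app.py
├── model.pkl
├── requirements.txt
└── Dockerfile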
The file structure for our simple model deployment project.
First, we need something to package. We'll use a simple scikit-learn model and a Flask application to serve it. You don't need to train the model yourself; for this exercise, we will assume you have a model.pkl file.
The Model (model.pkl)
For this guide, our model.pkl represents a simple pre-trained regression model. It expects an input of four numerical features and returns a single prediction.
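If you don't already have a model.pkl, a quick way to produce a compatible placeholder is to fit and serialize a trivial scikit-learn regressor. The script below is a minimal sketch; the toy data and the LinearRegression choice are ours for illustration, not part of any particular pre-trained model.

# create_model.py -- minimal sketch to generate a placeholder model.pkl
import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Generate toy training data: four numerical features, one target
X = np.random.rand(100, 4)
y = X.sum(axis=1) * 2.0  # arbitrary relationship, for demonstration only

# Fit a simple regression model and serialize it with joblib
model = LinearRegression()
model.fit(X, y)
joblib.dump(model, 'model.pkl')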
The Application (app.py)
This Python script uses the Flask framework to create a simple web server. It loads our model.pkl file and exposes a /predict endpoint that accepts POST requests with JSON data.
Create a file named app.py and add the following code:
import joblib
from flask import Flask, request, jsonify

# 1. Initialize Flask App
app = Flask(__name__)

# 2. Load the pre-trained model
# This is loaded only once when the app starts
model = joblib.load('model.pkl')

# 3. Define a prediction endpoint
@app.route('/predict', methods=['POST'])
def predict():
    """
    Receives a POST request with JSON data,
    runs the model, and returns the prediction.
    """
    try:
        # Get JSON data from the request
        data = request.get_json()
        # The 'features' key should contain a list of numbers
        features = data['features']
        # Run prediction
        prediction = model.predict([features])
        # Cast to float so the NumPy scalar serializes cleanly to JSON
        return jsonify({'prediction': float(prediction[0])})
    except Exception as e:
        # Handle errors, e.g., bad input format
        return jsonify({'error': str(e)}), 400

# 4. Start the Flask server
if __name__ == '__main__':
    # The server will be accessible on http://0.0.0.0:5000
    app.run(host='0.0.0.0', port=5000)
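Before containerizing anything, it can be worth a quick local sanity check, assuming model.pkl sits next to app.py and the dependencies from the next section are installed:

python app.py
# In a second terminal, send a test request:
curl -X POST -H "Content-Type: application/json" \
     -d '{"features": [1.0, 2.0, 3.0, 4.0]}' \
     http://localhost:5000/predict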
The Dependencies (requirements.txt)
Our application depends on scikit-learn (for the model) and Flask (for the web server). We must list these dependencies in a requirements.txt file so Docker knows what to install.
Create a file named requirements.txt:
scikit-learn==1.0.2
Flask==2.1.2
joblib==1.1.0
Note: Using specific versions is a good practice for ensuring reproducibility. Your application will always be built with these exact library versions, preventing unexpected behavior from library updates.
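If you developed the app inside a virtual environment, one common way to capture exact versions is pip freeze. Note that it lists every package installed in the environment, so trim the output down to what the app actually needs:

pip freeze > requirements.txt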
The Dockerfile is a text file that contains a set of instructions for building a Docker image. It's like a recipe for creating our application's environment.
Create a file named Dockerfile (with no extension) in your project directory and add the following instructions. We will explain each line below.
# Use an official Python runtime as a parent image
FROM python:3.9-slim
# Set the working directory in the container
WORKDIR /app
# Copy the requirements file first to leverage Docker cache
COPY requirements.txt .
# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application's code
COPY . .
# Expose the port the app runs on
EXPOSE 5000
# Define the command to run the application
CMD ["python", "app.py"]
Let's break down what each instruction does:
- FROM python:3.9-slim: Every Docker image starts from a base image. Here, we use an official Python 3.9 image. The -slim variant is a smaller version, which results in a more lightweight final image.
- WORKDIR /app: This sets the working directory inside the container to /app. All subsequent commands (COPY, RUN, CMD) will be executed from this directory.
- COPY requirements.txt .: We copy the requirements.txt file into the container's working directory. We do this before copying the rest of the code to take advantage of Docker's layer caching. If our Python code changes but the requirements do not, Docker can reuse the unchanged layers, making subsequent builds much faster.
- RUN pip install --no-cache-dir -r requirements.txt: This command executes pip install inside the image, installing all the libraries listed in our requirements.txt file. The --no-cache-dir flag keeps the image size smaller by not storing the download cache.
- COPY . .: This copies all remaining files from our local project directory (the build context) into the container's working directory (/app). This includes app.py and model.pkl. (See the .dockerignore sketch after this list for excluding files you don't want copied.)
- EXPOSE 5000: This instruction informs Docker that the container listens on the specified network port at runtime. It is primarily documentation and does not actually publish the port.
- CMD ["python", "app.py"]: This specifies the default command to run when a container is started from this image. In our case, it starts the Flask web server by running python app.py.

The process of using a Dockerfile to build a static Image, which is then used to launch one or more running Containers.
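Because COPY . . copies the entire build context, it is common to add a .dockerignore file next to the Dockerfile to keep development clutter out of the image. The entries below are a typical sketch; adjust them to your project:

# .dockerignore -- typical exclusions (adapt as needed)
__pycache__/
*.pyc
.git
.venv/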
Now that we have our Dockerfile, we can use it to build the image. Open your terminal or command prompt, navigate to your project directory, and run the following command:
docker build -t my-ml-app .
Let's dissect this command:
- docker build: The command to build an image from a Dockerfile.
- -t my-ml-app: The -t flag stands for "tag." It applies a name to our image, making it easy to reference later. We've named it my-ml-app.
- .: This final dot tells Docker to look for the Dockerfile in the current directory. This directory is also the "build context": all the files in it are sent to the Docker daemon for the build process.

Docker will now execute the instructions in your Dockerfile step by step. You will see output for each layer being built.
Once the build is complete, you have a Docker image named my-ml-app on your local machine. You can see it by running docker images.
Now, let's run a container from this image:
docker run -p 5001:5000 my-ml-app
Here is what the command does:
- docker run: The command to start a new container.
- -p 5001:5000: This is the port mapping flag. It maps port 5001 on your local machine (the host) to port 5000 inside the container. Our Flask app listens on port 5000 inside the container, and this mapping makes it accessible from our host machine on port 5001.
- my-ml-app: The name of the image to create the container from.

You should see output from the Flask server, indicating that it has started and is listening on http://0.0.0.0:5000.
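The command above keeps the container attached to your terminal. If you'd rather run it in the background, Docker's -d flag starts it detached, and --name gives it a handle for later commands (ml-service is an arbitrary name we chose here):

docker run -d -p 5001:5000 --name ml-service my-ml-app
docker logs ml-service   # view the server output
docker stop ml-service   # stop the container when finished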
Your model is now running inside an isolated container and is accessible on your local machine. Let's send it a prediction request.
Open a new terminal window (leave the one running the container open) and use a tool like curl to send a POST request:
curl -X POST \
-H "Content-Type: application/json" \
-d '{"features": [1.5, 2.5, 3.5, 4.5]}' \
http://localhost:5001/predict
If everything is working correctly, you will receive a JSON response from your model, something like this:
{"prediction":15.5}
Note: The exact prediction value depends on the model.pkl file you are using. The important part is receiving a valid JSON response.
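If you prefer testing from Python rather than curl, the same request can be sent with the requests library (assumed to be installed on your host, e.g. via pip install requests):

# test_request.py -- hypothetical helper script, not part of the image
import requests

response = requests.post(
    'http://localhost:5001/predict',
    json={'features': [1.5, 2.5, 3.5, 4.5]},
)
print(response.status_code, response.json())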
Congratulations! You have successfully packaged a machine learning model and a web application into a Docker container and served a prediction. This container is now a portable artifact. You can push it to a container registry and run it on any machine with Docker installed, confident that the environment and dependencies will be exactly the same.
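As a sketch of that last step: pushing usually means tagging the image with the registry's address and then pushing it. The registry host, username, and version tag below are placeholders, not values from this guide:

docker tag my-ml-app registry.example.com/your-user/my-ml-app:1.0
docker push registry.example.com/your-user/my-ml-app:1.0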