Okay, you've successfully packaged your Flask prediction service and all its dependencies into a Docker image. Think of this image as a self-contained blueprint. Now, it's time to bring that blueprint to life by running it as a container. A container is a live, running instance of your image, an isolated environment where your application executes.
## `docker run`

The fundamental command to start a container from an image is `docker run`. Its basic structure looks like this:

```bash
docker run [OPTIONS] IMAGE[:TAG] [COMMAND] [ARG...]
```
- `OPTIONS`: Flags that modify the container's behavior (we'll cover important ones shortly).
- `IMAGE[:TAG]`: The name of the image you want to run. If you followed the previous section, this might be something like `prediction-service`. Docker defaults to the `:latest` tag if none is specified.
- `COMMAND` and `ARG...`: Optional commands and arguments to execute inside the container, overriding the default command specified in the Dockerfile (like the `CMD ["python", "app.py"]` instruction).

Let's try running the container for our Flask application. If your image is named `prediction-service`, a simple execution would be:

```bash
# This will likely run in the foreground and print Flask logs to your terminal
docker run prediction-service
```
You'll probably see output from Flask indicating the server has started. However, running it this way occupies your terminal; to stop the container, you'd press `Ctrl+C`. For a web service, we typically want it to run in the background.

To run the container in the background (detached mode), use the `-d` option:

```bash
docker run -d prediction-service
```
This command starts the container and immediately returns your terminal prompt, printing a long container ID. But how do we interact with the Flask application running inside the container? By default, container ports are not accessible from your host machine. We need to map them.
Our Flask application inside the container is listening on a specific port (likely port 5000, as configured in the Flask app or Dockerfile). To access it from our host machine's browser or tools like `curl`, we need to map a port on the host to the container's port. This is done using the `-p` or `--publish` option, formatted as `host_port:container_port`.

For example, to map port 5000 on your host machine to port 5000 inside the container, you'd use `-p 5000:5000`.
This diagram illustrates how the `-p` flag connects a port on your host machine to the port where the Flask application is running inside the container.
Docker assigns random names to containers (like `focused_turing`). While functional, these aren't very memorable. You can assign a specific name using the `--name` option, which makes managing the container easier.
Let's combine these options to run our Flask service container:

- Run it in the background (`-d`).
- Map host port 5000 to container port 5000 (`-p 5000:5000`).
- Name the container `my-prediction-app` (`--name my-prediction-app`).
- Use the `prediction-service` image.

```bash
docker run -d -p 5000:5000 --name my-prediction-app prediction-service
```
If successful, Docker will print the container ID, and the container will be running in the background.
How do you know it's actually running? The `docker ps` command lists all currently running containers:

```bash
docker ps
```

You should see output similar to this (details might vary):

```text
CONTAINER ID   IMAGE                COMMAND           CREATED          STATUS          PORTS                    NAMES
a1b2c3d4e5f6   prediction-service   "python app.py"   15 seconds ago   Up 14 seconds   0.0.0.0:5000->5000/tcp   my-prediction-app
```
This confirms:

- The container (named `my-prediction-app`) is running.
- It was created from the `prediction-service` image.
- Port 5000 on the host (`0.0.0.0:5000`) is mapped to port 5000 inside the container.

You can also check the logs produced by the application inside the container using `docker logs`:

```bash
docker logs my-prediction-app
```

This should show the startup messages from your Flask application.
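The `0.0.0.0:5000->5000/tcp` entry in the `PORTS` column encodes the host-to-container mapping in a fixed textual format. Purely to illustrate that notation (this is just string parsing of the CLI output, not an official Docker API), a small sketch:

```python
def parse_port_mapping(ports_field: str) -> tuple[str, int, int]:
    """Split a `docker ps` PORTS entry like '0.0.0.0:5000->5000/tcp'
    into (host_ip, host_port, container_port)."""
    host_part, container_part = ports_field.split("->")
    host_ip, host_port = host_part.rsplit(":", 1)   # split off the host port
    container_port = container_part.split("/")[0]   # drop the '/tcp' suffix
    return host_ip, int(host_port), int(container_port)

print(parse_port_mapping("0.0.0.0:5000->5000/tcp"))
# ('0.0.0.0', 5000, 5000)
```

Reading the mapping left to right mirrors the `-p host_port:container_port` flag you supplied at `docker run` time.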
Now that the container is running and the port is mapped, you can send requests to your prediction service just as you did when testing locally, but targeting `localhost` on the host port you published (5000 in our example).
Using `curl` (assuming your API expects JSON input at a `/predict` endpoint):

```bash
curl -X POST -H "Content-Type: application/json" \
     -d '{"features": [1.0, 2.5, 3.0, 4.5]}' \
     http://localhost:5000/predict
```
Or using Python's `requests` library:

```python
import requests
import json

# Example input data (adjust according to your model's needs)
data = {"features": [1.0, 2.5, 3.0, 4.5]}

# URL of the service running in Docker
url = "http://localhost:5000/predict"  # Use the host port

try:
    response = requests.post(url, json=data)
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    prediction = response.json()
    print(f"Prediction response: {prediction}")
except requests.exceptions.RequestException as e:
    print(f"Error connecting to the service: {e}")
except json.JSONDecodeError:
    print(f"Could not decode JSON response: {response.text}")
```
You should receive the prediction output from your model, served by the Flask app running entirely within the isolated Docker container!
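A freshly started container can take a moment before Flask is accepting connections, so in scripts or automated tests it helps to poll the service until it answers before sending predictions. A minimal sketch, assuming the same `requests` library as above (the URL and timeout values are illustrative):

```python
import time
import requests

def wait_for_service(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `url` until it returns any HTTP response, or give up after
    `timeout` seconds. Returns True once the server is reachable."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            requests.get(url, timeout=2)
            return True  # any HTTP response means the server is listening
        except requests.exceptions.RequestException:
            time.sleep(interval)  # not up yet; wait and retry
    return False

# Example: give the freshly started container a few seconds to boot
# if wait_for_service("http://localhost:5000/predict", timeout=10):
#     ... send prediction requests ...
```

Note that even an HTTP error (such as 405 for a GET against a POST-only endpoint) counts as "up" here; the goal is only to detect that the server inside the container is listening.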
Once you are finished experimenting, you can stop and remove the container to free up resources.
**Stopping the container:** Use the `docker stop` command followed by the container's name or ID:

```bash
docker stop my-prediction-app
```

**Removing the container:** A stopped container still exists. To remove it completely, use `docker rm`:

```bash
docker rm my-prediction-app
```
You can only remove stopped containers. If you want to stop and remove in one go, you can use the `-f` (force) flag with `docker rm` (as in `docker rm -f my-prediction-app`), but it's generally better practice to stop the container first.

To see all containers, including stopped ones, use `docker ps -a`. Regularly removing stopped containers you no longer need is good housekeeping.
You have now successfully run your machine learning prediction service inside a Docker container, making it portable and ensuring a consistent runtime environment. This is a foundational step in deploying applications reliably.
© 2025 ApX Machine Learning