Okay, let's put theory into practice. In the previous sections, we discussed containerization concepts and Docker basics. Now, we'll combine what we've learned to package the Flask prediction service we built in Chapter 3 into a portable Docker container. This process will ensure our application runs predictably across different environments.
Make sure you have the following files from Chapter 3 ready in a single project directory:

- `app.py`: Your Flask application code that loads the model and defines the prediction endpoint.
- `model.joblib` (or `.pkl`): The saved machine learning model file.
- `requirements.txt`: A file listing the Python libraries your application needs (such as `flask`, `scikit-learn`, `numpy`, and `joblib`). If you don't have one, create it now. It should contain at least:

```text
flask
joblib
scikit-learn
numpy
# Add any other libraries your model or app.py needs
```

Our goal is to create a Docker image that contains the Python runtime, our application code, the model file, and all the libraries listed in `requirements.txt`. We'll then run this image as a container.
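One optional refinement to `requirements.txt` before moving on: the unpinned form installs whatever versions are current at build time, so two builds run weeks apart can produce different images. Pinning exact versions keeps builds reproducible. The version numbers below are purely illustrative; run `pip freeze` in the environment where you tested your app to capture the versions you actually used:

```text
flask==2.0.3
joblib==1.1.0
scikit-learn==1.0.2
numpy==1.22.3
```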
In your project directory, create a new file named `Dockerfile` (no file extension). Open it in your text editor and add the following instructions:
```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the requirements file into the container at /app
COPY requirements.txt .

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code and model into the container at /app
COPY . .

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Define environment variable for Flask
ENV FLASK_APP=app.py

# Run app.py when the container launches using Flask's built-in server
CMD ["flask", "run", "--host=0.0.0.0"]
```
Let's break down what each instruction does:
- `FROM python:3.9-slim`: Specifies the base image for our container. We're starting from a minimal official Python 3.9 image (the `slim` variant). Docker will download it if you don't have it locally.
- `WORKDIR /app`: Sets the working directory inside the container to `/app`. Subsequent instructions like `COPY` and `RUN` execute relative to this directory.
- `COPY requirements.txt .`: Copies the `requirements.txt` file from your project directory on the host machine into the container's working directory (`/app`). The `.` refers to the current working directory inside the container.
- `RUN pip install --no-cache-dir -r requirements.txt`: Runs `pip install` inside the container to install all the libraries listed in `requirements.txt`. The `--no-cache-dir` flag is often used in Docker builds to reduce image size by not storing pip's download cache.
- `COPY . .`: Copies everything else from your project directory (`app.py`, `model.joblib`, etc.) into the container's working directory (`/app`). We copy `requirements.txt` first and install dependencies before copying the rest of the code to take advantage of Docker's layer caching: if your code changes but your requirements don't, Docker reuses the cached `pip install` layer and only repeats the final `COPY`.
- `EXPOSE 5000`: Informs Docker that the container listens on port 5000 at runtime. This doesn't actually publish the port; it's closer to documentation within the image. The Flask development server defaults to port 5000.
- `ENV FLASK_APP=app.py`: Sets an environment variable inside the container, telling Flask which file to run.
- `CMD ["flask", "run", "--host=0.0.0.0"]`: Specifies the command to run when a container is started from this image. We use `flask run` to start the development server. The `--host=0.0.0.0` part is important: it makes the server listen on all network interfaces inside the container, so we can connect to it from outside.

Now that we have our `Dockerfile`, we can build the image. Open your terminal or command prompt, navigate to your project directory (the one containing the `Dockerfile`, `app.py`, etc.), and run the following command:
```bash
docker build -t ml-prediction-service .
```
Let's understand this command:

- `docker build`: The command to build an image from a Dockerfile.
- `-t ml-prediction-service`: The `-t` flag lets you "tag" the image with a name (and optionally a version tag like `:latest` or `:v1`). We're naming our image `ml-prediction-service`, which makes it easier to refer to later.
- `.`: Tells Docker to look for the `Dockerfile` in the current directory, which also becomes the build context.

Docker will now execute the instructions in your `Dockerfile` step by step. You'll see output as it downloads the base image (if needed), installs packages, and copies files. This might take a minute or two the first time, especially during the `pip install` step. Subsequent builds are usually faster thanks to Docker's layer caching.
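A related tip: `docker build` sends the entire contents of the build context (your project directory) to the Docker daemon, and `COPY . .` copies all of it into the image. If your directory holds files the application doesn't need at runtime, such as a virtual environment, raw datasets, or `.git` history, you can list them in a `.dockerignore` file next to the `Dockerfile` to keep builds fast and images small. The entries below are common examples, not requirements:

```text
.git
__pycache__/
*.pyc
venv/
.env
```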
If the build completes successfully, you've created your Docker image! You can verify this by listing your local Docker images:
```bash
docker images
```
You should see `ml-prediction-service` listed among them.
With the image built, we can now run it as a container. Execute this command in your terminal:
```bash
docker run -p 5000:5000 ml-prediction-service
```
Let's examine the `docker run` command:

- `docker run`: The command to start a new container from an image.
- `-p 5000:5000`: The port mapping flag. It maps port 5000 on your host machine (the first `5000`) to port 5000 inside the container (the second `5000`). Remember, `EXPOSE 5000` in the Dockerfile only documented the port; `-p` actually makes the connection, letting you reach the Flask app running inside the container at `localhost:5000` on your machine.
- `ml-prediction-service`: The name of the image we want to run.

You should see output from Flask indicating that the server is running, similar to when you ran it directly in Chapter 3, but now it's running inside the isolated container environment:
```text
 * Serving Flask app 'app.py'
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5000
 * Running on http://[::]:5000 (Press CTRL+C to quit)
```
Your terminal will be attached to the running container, showing its logs.
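As the log output warns, `flask run` starts Flask's development server, which is fine for this exercise but not meant for production traffic. A common swap is a WSGI server such as Gunicorn: add `gunicorn` to `requirements.txt` and change the final line of the `Dockerfile`. This sketch assumes your Flask application object is named `app` inside `app.py`; adjust `app:app` if yours differs:

```dockerfile
# Replace the flask-run CMD with a production WSGI server
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
```

With Gunicorn in charge, the `ENV FLASK_APP=app.py` line is no longer needed, though leaving it in place does no harm.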
Just like in Chapter 3, you can test your API. Open a new terminal window (leaving the container running in the first one) and use `curl` or a Python script with the `requests` library to send a POST request to `http://localhost:5000/predict` (or whatever endpoint you defined).
Using `curl` (replace the example data with valid input for your model):

```bash
curl -X POST -H "Content-Type: application/json" \
     -d '{"features": [5.1, 3.5, 1.4, 0.2]}' \
     http://localhost:5000/predict
```
You should receive the JSON prediction back from the service, just as before. The difference is that the service is now running inside a standardized, isolated Docker container.
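If you prefer Python over `curl`, the same request can be sent with a short script. The sketch below uses only the standard library's `urllib`, so nothing extra needs to be installed (the `requests` library mentioned above works similarly); the endpoint and feature values are the same assumptions as in the `curl` example:

```python
import json
from urllib import request

# Build the same JSON payload the curl example sends
payload = json.dumps({"features": [5.1, 3.5, 1.4, 0.2]}).encode("utf-8")

req = request.Request(
    "http://localhost:5000/predict",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the container running and port 5000 published, send the request
# and print the prediction:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

The `urlopen` call is commented out here because it only succeeds while the container is up; uncomment it to run the test end to end.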
To stop the container running in the foreground, go back to the terminal where it's running and press `CTRL+C`.
If you want to run the container in the background (detached mode), add the `-d` flag:

```bash
docker run -d -p 5000:5000 ml-prediction-service
```
This command starts the container and prints its unique ID; the container keeps running in the background. To see running containers, use:

```bash
docker ps
```
To stop a detached container, use `docker stop <container_id_or_name>`:

```bash
docker stop <output_container_id_from_docker_ps>
```
Congratulations! You have successfully containerized your Flask prediction service using Docker. You now have a `Dockerfile` that defines the environment and a Docker image that bundles your application and its dependencies. This image can be shared and run consistently on any machine with Docker installed, which greatly simplifies deployment. This is a fundamental step toward making your machine learning models reliably available in different environments.
© 2025 ApX Machine Learning