When deploying machine learning models for inference, the size of the Docker image becomes a significant factor. Larger images take longer to download, consume more storage, and can increase the attack surface by including unnecessary build tools or libraries. A common cause of large images is the inclusion of build-time dependencies, such as compilers, headers, or even entire SDKs, that are needed to install libraries but are not required to run the final application.
Multi-stage builds offer an effective solution to this problem. They allow you to use multiple FROM instructions within a single Dockerfile. Each FROM instruction can use a different base image and begins a new "stage" of the build. You can selectively copy artifacts, such as compiled code, installed dependencies, or model files, from one stage to another, leaving behind everything you don't need in the final image.
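In its simplest form, the pattern looks like the sketch below. The stage name builder, the requests package, and the site-packages path are illustrative choices, not required conventions:

# Stage 1: full-featured base image where build tools are available
FROM python:3.9 AS builder
# Install an example dependency; any compilation happens in this stage
RUN pip install --no-cache-dir requests

# Stage 2: a fresh, empty stage on a lean base image
FROM python:3.9-slim
# Bring across only the installed packages; the build toolchain stays behind
COPY --from=builder /usr/local/lib/python3.9/site-packages /usr/local/lib/python3.9/site-packages
CMD ["python", "-c", "import requests; print(requests.__version__)"]

Only the layers of the last stage end up in the image you ship; everything produced in earlier stages is discarded unless explicitly copied across.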
The strategy involves creating distinct stages for building and running the application:
- Build stage: Start from a full-featured base image that provides the tools needed to install and compile dependencies, and name the stage with AS <stage_name> (e.g., AS builder) so later stages can refer to it.
- Final stage: Begin a new stage from a minimal base image (e.g., python:3.9-slim or a distroless image). It contains only the components essential for running your inference service.
- Copy artifacts: Use the COPY --from=<stage_name> instruction in the final stage. This command copies specific files or directories from the designated earlier stage (e.g., builder) into the final stage's filesystem.

This process ensures that the final image only includes the runtime application, its direct dependencies, and any necessary data files, discarding the intermediate build environment and its associated bloat.
Consider an inference service built with FastAPI that requires libraries which might have complex build dependencies. A multi-stage Dockerfile could look like this:
# Stage 1: Builder stage with build tools
FROM python:3.9 AS builder
WORKDIR /app
# Install build essentials if needed (example for Debian-based images)
# RUN apt-get update && apt-get install -y --no-install-recommends build-essential
# Copy only requirements first to leverage Docker cache
COPY requirements.txt .
# Install all dependencies, including build-time ones
# Using a virtual environment within the build stage can help isolate packages
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application code
COPY . .
# Optional: Download model artifacts if they aren't copied with the code
# RUN python download_model.py
# Stage 2: Final stage with a slim runtime environment
FROM python:3.9-slim
WORKDIR /app
# Copy the virtual environment from the builder stage
COPY --from=builder /opt/venv /opt/venv
# Copy the application code (adjust path if needed)
COPY --from=builder /app /app
# Copy model files if they were downloaded or part of the app code in builder
# COPY --from=builder /app/models /app/models
# Make port 80 available to the world outside this container
EXPOSE 80
# Set the path to include the virtual environment's binaries
ENV PATH="/opt/venv/bin:$PATH"
# Run the application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
In this example:
- The builder stage uses the standard python:3.9 image, installs dependencies from requirements.txt into a virtual environment (/opt/venv), and copies in the application code.
- The final stage starts from the much smaller python:3.9-slim image.
- COPY --from=builder /opt/venv /opt/venv copies the entire virtual environment, containing only the installed Python packages, from the builder stage to the final stage.
- COPY --from=builder /app /app copies the application source code.
- Compilers and other build-only tooling never leave the discarded builder stage.

Diagram illustrating copying specific artifacts (virtual environment, application code) from a larger build stage to a smaller final runtime stage in a multi-stage Docker build.
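To observe the size difference, build the image and list it; the inference-service tag is an arbitrary example:

docker build -t inference-service:latest .
docker images inference-service

Because the final stage starts from python:3.9-slim and receives only the virtual environment and application code, the resulting image is substantially smaller than a single-stage build on the full python:3.9 image.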
When applying this pattern, two practices help keep the final image lean:

- Identify exactly what the final stage needs and copy it with COPY --from. This might include installed packages (like the site-packages directory or a virtual environment), compiled binaries, static assets, and model files. Inspecting the filesystem of an intermediate build stage container can be helpful (docker run --rm -it <builder_image_id> bash); see the commands after this list.
- Choose the smallest base image that can still run your service for the final stage (e.g., slim, alpine, or distroless).
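Docker can stop the build at a named stage with the --target flag, which makes this kind of inspection convenient; the tag below is again an arbitrary example:

docker build --target builder -t inference-service:builder .
docker run --rm -it inference-service:builder bash
# Inside the container you can then explore, for example:
# ls /opt/venv/lib/python3.9/site-packages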
Multi-stage builds are a standard practice for creating production-ready container images, and they are particularly important for ML inference services, where efficiency and security are significant concerns. By separating build-time requirements from runtime necessities, you create leaner, faster, and more secure containers for serving your models.