Now that we've discussed structuring your LLM application code, managing secrets, estimating costs, caching, and testing, let's address how to package your application for deployment. Containerization, primarily using Docker, is a standard practice that helps ensure your application runs consistently across different environments, from your development machine to staging and production servers.
This practical exercise guides you through containerizing a simple LLM application, similar to the Q&A bot or agent you might have built in previous chapters.
Prerequisites
Before you start, ensure you have:
- Docker Installed: You need Docker Desktop (for Mac/Windows) or Docker Engine (for Linux) installed and running on your system. You can find installation instructions on the official Docker website.
- A Simple LLM Application: Have a basic Python application ready. This could be a simple Flask or FastAPI web server with an endpoint that accepts a user query, interacts with an LLM API (like OpenAI's), and returns the result. Assume your application structure looks something like this:
llm_app/
├── app.py # Your main application script (e.g., Flask/FastAPI)
├── requirements.txt # Lists Python dependencies (e.g., openai, flask)
└── .env # Optional: File to store environment variables locally (should NOT be committed)
Your requirements.txt might contain:
flask
openai
python-dotenv
Your app.py would typically load the API key from environment variables and define a route (e.g., /ask) to handle requests.
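For reference, a minimal app.py along these lines might look like the following sketch (the model name and response shape are illustrative assumptions, not requirements):

import os

from dotenv import load_dotenv
from flask import Flask, jsonify, request
from openai import OpenAI

load_dotenv()  # Load OPENAI_API_KEY from a local .env file during development

app = Flask(__name__)
client = OpenAI()  # Reads OPENAI_API_KEY from the environment

@app.route("/ask", methods=["POST"])
def ask():
    data = request.get_json() or {}
    query = data.get("query", "")
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # Illustrative model name; use whichever model you prefer
        messages=[{"role": "user", "content": query}],
    )
    return jsonify({"answer": completion.choices[0].message.content})

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the app is reachable from outside the container;
    # Flask's default of 127.0.0.1 would make Docker's port mapping useless.
    app.run(host="0.0.0.0", port=5000)

Note the host="0.0.0.0" binding: Flask defaults to listening only on 127.0.0.1, which would make the app unreachable through the port mapping we set up in Step 3.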
Step 1: Create the Dockerfile
A Dockerfile is a text file containing instructions Docker uses to build an image of your application. Create a file named Dockerfile (no extension) in the root directory of your llm_app project.
Add the following content to your Dockerfile:
# Use an official Python runtime as a parent image
FROM python:3.10-slim
# Set the working directory in the container
WORKDIR /app
# Copy the requirements file into the container at /app
COPY requirements.txt .
# Install any needed packages specified in requirements.txt
# Use --no-cache-dir to reduce image size
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application code into the container at /app
COPY . .
# Make port 5000 available to the world outside this container (if using Flask default)
# Adjust if your app uses a different port
EXPOSE 5000
# Define environment variable for the API key (placeholder)
# IMPORTANT: This is NOT secure for production. See notes below.
ENV OPENAI_API_KEY=your_default_or_placeholder_key
# Command to run the application when the container launches
# Adjust this command based on how you run your app (e.g., uvicorn for FastAPI)
CMD ["python", "app.py"]
Understanding the Dockerfile Instructions:
- FROM python:3.10-slim: Specifies the base image. We're using a slim version of the official Python 3.10 image.
- WORKDIR /app: Sets the default directory for subsequent instructions inside the container.
- COPY requirements.txt .: Copies only the requirements file first. Docker caches layers, so if requirements.txt doesn't change, the pip install step won't need to rerun on subsequent builds, speeding things up.
- RUN pip install ...: Executes the command to install dependencies.
- COPY . .: Copies the rest of your application code (like app.py) into the /app directory in the container.
- EXPOSE 5000: Informs Docker that the container listens on port 5000 at runtime. This doesn't actually publish the port; it's more like documentation.
- ENV OPENAI_API_KEY=...: Sets an environment variable inside the container image. Warning: Hardcoding secrets like this, or even setting them via ENV, is generally insecure and bad practice for production, because the secret remains visible in the image layers. We do it here for simplicity, but in real applications you should inject secrets at runtime (as shown in Step 3) or use secret management tools. Refer back to the "Managing API Keys and Secrets" section for better approaches.
- CMD ["python", "app.py"]: Specifies the default command to execute when a container is started from this image.
Step 2: Build the Docker Image
Open your terminal or command prompt, navigate to the directory containing your Dockerfile and application code (llm_app/), and run the build command:
docker build -t my-llm-app .
- docker build: The command to build an image.
- -t my-llm-app: Tags the image with a name (my-llm-app) and optionally a tag (defaulting to latest). This makes it easier to refer to the image later.
- .: Specifies the build context (the current directory), which includes the Dockerfile and your application code.
Docker will execute the instructions in your Dockerfile step by step, downloading the base image and creating layers for each instruction.
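When the build completes, you can confirm the image exists before running it:

# List local images named my-llm-app; it should appear with the latest tag
docker images my-llm-app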
Step 3: Run the Docker Container
Once the image is built successfully, you can run a container based on it:
docker run -p 5001:5000 --env OPENAI_API_KEY="YOUR_ACTUAL_API_KEY" --name llm_container my-llm-app
- docker run: The command to create and start a container.
- -p 5001:5000: Maps port 5001 on your host machine to port 5000 inside the container (the port your app is listening on, as specified by EXPOSE and your Flask/FastAPI setup). You can now access your app via http://localhost:5001.
- --env OPENAI_API_KEY="YOUR_ACTUAL_API_KEY": This is a more secure way to provide the API key than baking it into the image with ENV. It sets the environment variable only for this specific container instance at runtime. Replace "YOUR_ACTUAL_API_KEY" with your real key.
- --name llm_container: Assigns a recognizable name to the running container.
- my-llm-app: The name of the image to run.
If you want the container to run in the background (detached mode), add the -d flag:
docker run -d -p 5001:5000 --env OPENAI_API_KEY="YOUR_ACTUAL_API_KEY" --name llm_container my-llm-app
You can view logs using docker logs llm_container. To stop the container, use docker stop llm_container. To remove it, use docker rm llm_container.
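As an alternative to typing the key on the command line (where it can linger in your shell history), docker run also accepts an --env-file flag. If you keep your key in the local .env file from the project layout above, you can load it at runtime while still keeping it out of the image and out of version control:

# Load all KEY=VALUE pairs from .env into the container's environment
docker run -d -p 5001:5000 --env-file .env --name llm_container my-llm-app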
Step 4: Test the Containerized Application
With the container running, open a new terminal or use a tool like Postman or curl to send a request to your application, now accessible on the host port you mapped (e.g., 5001):
# Example using curl, assuming your app has a POST endpoint at /ask
curl -X POST http://localhost:5001/ask \
-H "Content-Type: application/json" \
-d '{"query": "Explain containerization in simple terms."}'
You should receive a response generated by the LLM, served by your application running inside the Docker container.
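If you prefer Python to curl, the same check can be made with the requests library (assuming it is installed on your host):

import requests

# Send the same query to the containerized app via the mapped host port
resp = requests.post(
    "http://localhost:5001/ask",
    json={"query": "Explain containerization in simple terms."},
)
print(resp.json())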
Benefits and Next Steps
You have successfully containerized your simple LLM application! This package (the Docker image) contains your application, its dependencies, and runtime, ensuring it behaves consistently wherever Docker is available. This simplifies sharing your application with others, deploying it to cloud platforms, and managing dependencies reliably.
For more advanced scenarios, consider exploring:
- A .dockerignore file (similar syntax to .gitignore) to exclude files and directories (like .git, __pycache__, .env) from being copied into the image, keeping it smaller and more secure (see the sample below).
- Multi-stage builds: using multiple FROM statements in your Dockerfile to separate build-time dependencies from runtime dependencies, resulting in smaller final images (a sketch follows the .dockerignore sample).
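To make these concrete, here is a sample .dockerignore for this project (entries are suggestions; tailor them to your repository):

.git
__pycache__/
*.pyc
.env

And here is a sketch of a multi-stage build for the same app, assuming /usr/local is the Python prefix in the runtime image (which it is for the official slim images): the builder stage installs dependencies into an isolated prefix, and only the installed packages plus your code are copied into the final image:

# Build stage: install dependencies into an isolated prefix
FROM python:3.10-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Runtime stage: start clean and copy in only what is needed
FROM python:3.10-slim
WORKDIR /app
COPY --from=builder /install /usr/local
COPY . .
EXPOSE 5000
CMD ["python", "app.py"]

Containerization is a foundational skill for modern application development and deployment, including applications built with Large Language Models.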