As we shift from managing single containers to orchestrating multi-container applications with Docker Compose, the need for effective configuration becomes even more apparent. Services within your stack, such as an inference API needing database credentials or a training job requiring specific hyperparameters, often need external settings to function correctly. Manually passing these via complex docker run commands for each container quickly becomes impractical.
Docker Compose provides a structured and convenient way to manage configuration for your services using environment variables, directly within the docker-compose.yml file or through external files. This approach centralizes configuration, making your multi-container ML applications easier to manage, adapt, and deploy across different environments.
docker-compose.yml
The most direct way to set environment variables for a service is using the environment key within its definition in docker-compose.yml. You can define these variables either as a list or as a map (dictionary).
Using a List:
This is often the preferred format as it's explicit and avoids potential type interpretation issues. Each item in the list is a string in the format VARIABLE=value.
# docker-compose.yml
version: '3.8'
services:
  training_job:
    build: ./training_app
    image: my_training_app:latest
    environment:
      - LEARNING_RATE=0.01
      - EPOCHS=50
      - MODEL_OUTPUT_DIR=/app/outputs
      - WANDB_API_KEY=${WANDB_API_KEY_HOST} # Gets value from host environment
In this example, LEARNING_RATE, EPOCHS, and MODEL_OUTPUT_DIR are set directly. WANDB_API_KEY demonstrates how you can substitute a value from the environment of the host machine running docker-compose up. If WANDB_API_KEY_HOST is set on your host, its value will be assigned to WANDB_API_KEY inside the container. If it's not set on the host, Compose prints a warning and substitutes an empty string, so WANDB_API_KEY inside the container will be empty.
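Because an unset host variable silently becomes an empty string, the application may want to check for it at startup. The following is a minimal sketch of such a check, assuming a Python training job; the helper name require_env is hypothetical and not part of Docker or any library.

```python
import os


def require_env(name, environ=None):
    """Return the value of `name`, raising if it is unset or empty.

    `environ` defaults to the process environment; passing a plain dict
    makes the helper easy to exercise without touching os.environ.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name, "")
    if not value:
        # An empty string is what the container sees when the host
        # variable used for substitution was not set.
        raise RuntimeError(
            f"required environment variable {name} is unset or empty"
        )
    return value
```

A training script could call require_env("WANDB_API_KEY") before any expensive work starts, so a missing key fails fast rather than partway through a run.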
Using a Map:
Alternatively, you can use a dictionary format.
# docker-compose.yml
version: '3.8'
services:
  inference_api:
    build: ./inference_api
    image: my_inference_api:latest
    environment:
      MODEL_PATH: /models/latest.pkl
      API_PORT: 8000
      DATABASE_URL: ${DB_CONNECTION_STRING} # Gets value from host environment
    ports:
      - "8000:8000" # Maps host port 8000 to container port 8000
Here, MODEL_PATH and API_PORT are set. Environment variables are always strings inside the container, so Compose converts map values such as the number 8000 to strings. If your application expects a numeric type (like the port number), it must convert the string itself. Boolean-like values (true, false, yes, no) should be quoted so that the YAML parser does not reinterpret them. DATABASE_URL again shows substitution from the host environment.
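The string-to-number conversion can be kept in one place when the service starts. This is a sketch under the assumption that the inference API is written in Python; the ApiConfig class, the load_config function, and the fallback values are illustrative choices, not something Compose provides.

```python
import os
from dataclasses import dataclass


@dataclass
class ApiConfig:
    model_path: str
    api_port: int


def load_config(environ=None):
    """Build a typed config object from string-valued environment variables."""
    environ = os.environ if environ is None else environ
    return ApiConfig(
        # Fallbacks mirror the Compose example above (an assumption).
        model_path=environ.get("MODEL_PATH", "/models/latest.pkl"),
        # API_PORT arrives as the string "8000"; int() raises ValueError
        # if the value is not a valid integer.
        api_port=int(environ.get("API_PORT", "8000")),
    )
```

Converting once at startup means a malformed value such as API_PORT=eight-thousand is reported immediately instead of surfacing later as a type error.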
As seen above, Compose allows you to inject variables from the host machine's environment into your container's environment. This is extremely useful for sensitive information like API keys or database passwords, or for settings that change between development, testing, and production environments.
The syntax is ${HOST_VARIABLE}. If HOST_VARIABLE exists on the host, its value is used. If not, the behavior depends on whether you provide a default:
- ${HOST_VARIABLE}: If not set on the host, the variable inside the container becomes an empty string.
- ${HOST_VARIABLE:-default}: If not set on the host (or set to an empty string), the variable inside the container gets the value default.
- ${HOST_VARIABLE:?error message}: If not set on the host (or set to an empty string), Compose will stop and display the error message.
Example with a default value:
# docker-compose.yml
services:
  worker:
    image: my_worker:latest
    environment:
      - QUEUE_NAME=${JOB_QUEUE:-default_queue}
      - LOG_LEVEL=${LOG_LEVEL:-INFO}
If JOB_QUEUE is not set on the host, the QUEUE_NAME inside the container will be default_queue. Similarly, LOG_LEVEL defaults to INFO.
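To make the three substitution forms concrete, here is a simplified Python model of their behavior. It is only an illustration, not how Compose is implemented: the substitute function is hypothetical, it covers just the ${VAR}, ${VAR:-default}, and ${VAR:?message} forms, and it treats unset and empty values alike.

```python
import re

# Matches ${NAME}, ${NAME:-default}, and ${NAME:?message}.
_PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::(?P<op>[-?])(?P<arg>[^}]*))?\}"
)


def substitute(text, env):
    """Expand ${...} references in `text` using the mapping `env`."""

    def replace(match):
        name, op, arg = match.group("name", "op", "arg")
        value = env.get(name, "")
        if value:
            return value          # variable is set and non-empty
        if op == "-":
            return arg            # ${NAME:-default} falls back to the default
        if op == "?":
            raise ValueError(f"{name}: {arg}")  # ${NAME:?message} aborts
        return ""                 # plain ${NAME} becomes an empty string

    return _PATTERN.sub(replace, text)
```

With an empty mapping, substitute("QUEUE_NAME=${JOB_QUEUE:-default_queue}", {}) yields the default, while passing {"JOB_QUEUE": "gpu_jobs"} yields the host value instead.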
.env Files for Local Development
Hardcoding configuration, especially secrets, directly in docker-compose.yml is not recommended. Checking such files into version control exposes sensitive data. A common practice for managing environment-specific variables, particularly during local development, is to use an environment file, typically named .env.
Docker Compose automatically looks for a file named .env in the project directory, which by default is the directory containing docker-compose.yml (or the one specified with the --project-directory flag). Older docker-compose releases looked in the directory where the command was run instead. Variables defined in this .env file are automatically substituted into your docker-compose.yml file wherever the ${VARIABLE} syntax is used.
Example .env file:
# .env (This file should typically be added to .gitignore)
DB_USER=ml_user
DB_PASSWORD=supersecretdevpassword
DB_HOST=db # Service name defined in docker-compose.yml
DB_NAME=mldatabase
API_KEY=dev-abc123xyz
Example docker-compose.yml using .env variables:
# docker-compose.yml
version: '3.8'
services:
  api:
    build: ./api
    image: my_ml_api:latest
    environment:
      - DATABASE_USER=${DB_USER}
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - DATABASE_HOST=${DB_HOST}
      - DATABASE_NAME=${DB_NAME}
      - EXTERNAL_SERVICE_KEY=${API_KEY}
    depends_on:
      - db
    ports:
      - "5000:5000"
  db:
    image: postgres:14-alpine
    environment:
      # PostgreSQL image uses these specific env vars
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_NAME}
    volumes:
      - db_data:/var/lib/postgresql/data
volumes:
  db_data:
When you run docker-compose up, Compose reads the .env file, finds DB_USER=ml_user, and substitutes ml_user wherever ${DB_USER} appears in the docker-compose.yml file. This keeps your secrets out of version control while providing necessary configuration to your services. Remember to add .env to your .gitignore file!
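The lookup step described above can be pictured with a small parser for KEY=VALUE lines. This is a deliberately simplified sketch: parse_env_file is a hypothetical helper, and it ignores quoting, multi-line values, and the other details that Compose's real .env handling covers.

```python
def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no '=' on the line; ignored in this simplified model
        # Treat " #" in an unquoted value as the start of an inline comment.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values
```

Applied to the example file above, the line DB_HOST=db # Service name defined in docker-compose.yml would produce the entry DB_HOST with the value db, with the trailing comment dropped.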
env_file
While the automatic .env loading is convenient, sometimes you need more control, perhaps using different environment files for different situations (e.g., .env.dev, .env.prod, .env.test). The env_file directive within a service definition allows you to specify one or more custom environment files.
# docker-compose.yml
version: '3.8'
services:
  inference_service:
    build: ./inference
    image: my_predictor:latest
    # Loads variables from ./config/.prod.env first,
    # then ./config/.common.env, whose values override duplicates
    # from the first file
    env_file:
      - ./config/.prod.env
      - ./config/.common.env
    environment:
      # Variables defined here override those from env_file
      - LOG_LEVEL=INFO
Variables loaded via env_file are set in the container's environment. If you list multiple files, they are read in order, and variables defined in later files override those from earlier files. Variables defined directly under the environment key take precedence over those loaded via env_file.
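The override rules can be modeled as successive dictionary updates. The sketch below assumes each env file has already been parsed into a dict; resolve_environment and the sample values in the usage note are hypothetical and only illustrate the ordering.

```python
def resolve_environment(env_files, environment):
    """Merge env_file contents in order, then apply the environment key.

    `env_files` is a list of dicts, one per file, in the order they are
    listed under env_file. `environment` holds the values set directly
    under the environment key.
    """
    merged = {}
    for file_values in env_files:
        merged.update(file_values)  # later files override earlier ones
    merged.update(environment)      # environment key overrides env_file values
    return merged
```

For example, if the first file sets LOG_LEVEL to DEBUG, the second sets it to WARNING, and the environment key sets it to INFO, the container sees INFO. Without the environment entry it would see WARNING.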
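Understanding the order in which environment variables are applied is important. From highest to lowest precedence:
1. docker-compose run -e VARIABLE=value ... (applies only to run).
2. The environment section in docker-compose.yml.
3. The env_file directive.
4. The ENV instruction in the Dockerfile.
Variables set later in this list are overridden by those set earlier.
The .env file and the host shell environment are used for substitution in docker-compose.yml rather than being passed to containers directly. When both define the same variable, the value from the host shell wins.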
If you need an environment variable within the container to actually contain a dollar sign ($), you need to escape it using $$ in your docker-compose.yml file.
# docker-compose.yml
services:
  my_service:
    image: some_image
    environment:
      - CONFIG_VAR=uses_a_literal_$$VAR # Becomes CONFIG_VAR=uses_a_literal_$VAR inside container
By leveraging the environment key, host variable substitution, .env files, and the env_file directive, Docker Compose offers flexible and secure methods for configuring your multi-container ML applications. This allows you to separate configuration from your application code and Docker images, making your setup more portable and easier to manage across different development and deployment stages.
© 2025 ApX Machine Learning