As you build Docker images for your Machine Learning projects, you'll often need ways to configure the container's behavior without modifying the underlying code or rebuilding the image every time. Environment variables provide a standard mechanism for passing configuration settings into your containers. They are indispensable for parameterizing your ML environments, influencing script execution, and managing sensitive information.
Environment variables are dynamic named values that can affect the way running processes behave on a computer. Within a Docker container, they function similarly, providing a way to inject configuration from the outside world (the Docker host or orchestration system) into the isolated container environment.
ENV Instruction
The primary way to define environment variables within your Dockerfile is using the ENV instruction. This sets variables that will be available both during the subsequent steps of the image build process and when containers are run from the resulting image.
The ENV instruction has two forms:
ENV <key>=<value>: Sets a single variable. If the value contains spaces, enclose it in quotes or escape the spaces with backslashes.
ENV <key1>=<value1> <key2>=<value2> ...: Sets multiple variables in a single instruction. This form is often preferred as it creates fewer image layers.
Let's see an example within an ML context:
# Dockerfile
FROM python:3.9-slim
# Set default logging level and model directory
ENV LOG_LEVEL=INFO \
MODEL_DIR=/app/models
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# This script might use the MODEL_DIR variable
CMD ["python", "train.py"]
In this Dockerfile, LOG_LEVEL is set to INFO and MODEL_DIR is set to /app/models. Any scripts running inside the container (like train.py) can access these values, typically using standard library functions available in the programming language (e.g., os.environ.get('MODEL_DIR') in Python). Setting defaults like this makes your image more self-contained and predictable.
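As a minimal sketch of the access pattern just described, a script such as train.py might read the two variables at startup. The function name load_settings, its env parameter, and the fallback values are illustrative assumptions rather than part of the original project.

```python
import logging
import os


def load_settings(env=None):
    """Read LOG_LEVEL and MODEL_DIR, falling back to the Dockerfile defaults."""
    # `env` is a hypothetical parameter that makes the function easy to test;
    # when omitted, the real process environment is used.
    env = os.environ if env is None else env
    return {
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
        "model_dir": env.get("MODEL_DIR", "/app/models"),
    }


if __name__ == "__main__":
    settings = load_settings()
    # basicConfig accepts level names such as "DEBUG";
    # an unknown name raises ValueError.
    logging.basicConfig(level=settings["log_level"])
    logging.info("Saving models to %s", settings["model_dir"])
```

Repeating the defaults in the script is a design choice: it keeps the script usable outside the container, where the ENV values from the image are not present.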
ARG
Docker also provides the ARG instruction to define variables that users can pass at build-time using the --build-arg flag with the docker build command.
# Dockerfile
ARG PYTHON_VERSION=3.9
FROM python:${PYTHON_VERSION}-slim
ARG CACHE_DATE=not_set
ENV LAST_REFRESHED_AT=$CACHE_DATE
# ... rest of the Dockerfile
Here, PYTHON_VERSION is a build argument with a default value of 3.9. You could build this image using a different Python version like this:
docker build --build-arg PYTHON_VERSION=3.10 -t my-ml-app .
An important distinction exists between ARG and ENV:
ARG variables are generally only available during the image build process. They do not persist as environment variables in the final image or running containers by default.
ENV variables are available during the build and persist in the final image, accessible by applications running inside containers launched from that image.
You can, however, make a build argument persist as an environment variable by defining an ENV instruction that uses the ARG variable, as shown with LAST_REFRESHED_AT in the example above. This pattern is useful for embedding build-time information into the runtime environment.
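Use ARG for parameters that affect the build itself (like base image versions or source repositories). Do not use ARG to pass secrets: build argument values are recorded in the image's build history and can be read back with docker history. For build-time secrets, Docker's build secret mounts (RUN --mount=type=secret) are the intended mechanism. Use ENV for runtime configuration defaults needed by your application.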
One of the significant benefits of environment variables is the ability to override the defaults set in the Dockerfile when you launch a container. This provides flexibility without needing to rebuild the image. The docker run command accepts the -e (or --env) flag for this purpose.
# Run container with default settings
docker run my-ml-app
# Override the LOG_LEVEL for debugging
docker run -e LOG_LEVEL=DEBUG my-ml-app
# Override the model directory
docker run -e MODEL_DIR=/data/production_model my-ml-app
# Set a new variable not defined in the Dockerfile
docker run -e API_KEY=xyz123 my-ml-app
You can use the -e flag multiple times to set or override several variables. This runtime configuration is fundamental for adapting a generic ML image to specific tasks, environments (development, staging, production), or datasets.
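A variable such as API_KEY exists only if someone passes it at runtime, so a script may want to fail early with a clear message when it is missing. The following is a minimal sketch of that idea; the helper name require_env and the wording of the error message are assumptions made for illustration.

```python
import os


def require_env(name, env=None):
    """Return the value of `name`, or raise a clear error if it is unset or empty."""
    # `env` is a hypothetical parameter for testing; the real process
    # environment is used when it is omitted.
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is required; "
            f"pass it with: docker run -e {name}=<value> ..."
        )
    return value


# Example with an explicit mapping standing in for the container environment.
example_env = {"API_KEY": "xyz123"}
api_key = require_env("API_KEY", example_env)
```

Calling require_env("API_KEY") at the top of a script surfaces a missing -e flag at startup instead of partway through a training run.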
Managing many environment variables via -e flags on the command line can become cumbersome. Docker allows you to place these variables in a file (conventionally named .env) and pass that file during container startup using the --env-file flag.
An example .env file:
# .env file
LOG_LEVEL=INFO
MODEL_DIR=/data/shared/models
DATABASE_URL=postgresql://user:pass@dbhost:5432/ml_results
S3_BUCKET=my-ml-artifacts
Each line follows the KEY=VALUE format. Comments start with #.
You can then run the container like this:
docker run --env-file .env my-ml-app
Docker reads the variables from the .env file and sets them in the container environment. Variables set explicitly with -e will override those defined in the .env file if there are conflicts. Using .env files is a good practice for managing configuration, especially for separating settings between different deployment environments. It also helps keep sensitive information out of shell history or command-line logs.
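The file format and the precedence rule can be modeled in a few lines of Python. This is a simplified sketch for building intuition, not Docker's actual parser: it assumes plain KEY=VALUE lines, skips blank lines and # comments, and does no quote handling. The function names parse_env_file and effective_env are hypothetical.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines; skip blank lines and lines starting with '#'."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if not sep:
            continue  # simplification: ignore lines without '='
        variables[key.strip()] = value
    return variables


def effective_env(image_defaults, env_file_vars, cli_vars):
    """Later sources win: ENV defaults < --env-file < -e flags."""
    return {**image_defaults, **env_file_vars, **cli_vars}


env_file_text = """\
# .env file
LOG_LEVEL=INFO
MODEL_DIR=/data/shared/models
DATABASE_URL=postgresql://user:pass@dbhost:5432/ml_results
"""

merged = effective_env(
    {"LOG_LEVEL": "INFO", "MODEL_DIR": "/app/models"},  # ENV in the Dockerfile
    parse_env_file(env_file_text),                      # --env-file .env
    {"LOG_LEVEL": "DEBUG"},                             # -e LOG_LEVEL=DEBUG
)
print(merged["LOG_LEVEL"])  # DEBUG
print(merged["MODEL_DIR"])  # /data/shared/models
```

Here the -e value wins for LOG_LEVEL, while MODEL_DIR comes from the file because nothing on the command line overrides it.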
Environment variables are widely used in containerized ML workflows:
File paths: locations for input data (DATA_DIR), output models (MODEL_DIR), logs (LOG_DIR), or temporary files (TMP_DIR).
Hyperparameters: values such as LEARNING_RATE or BATCH_SIZE can sometimes be set via ENV, especially for inference servers.
Credentials: avoid hardcoding API keys or passwords in the Dockerfile using ENV. Prefer runtime injection using -e, --env-file, or dedicated secrets management systems in production.
Runtime behavior: setting LOG_LEVEL, enabling/disabling features (ENABLE_GPU=true), or specifying ML framework backends (KERAS_BACKEND=tensorflow).
Service connections: addresses of other services (DATABASE_HOST=db, REDIS_PORT=6379).
A few practices keep this configuration manageable:
Keep secrets out of the image: do not store secrets in the Dockerfile using ENV. These become part of the image layers and can be inspected. Use runtime injection (-e, --env-file) or proper secrets management tools for production workloads.
Provide defaults: use ENV in your Dockerfile to provide reasonable default values for common configurations. This makes the image easier to use out-of-the-box.
Distinguish ARG and ENV: Use ARG for build-time customization and ENV for runtime configuration.
Document your variables: list the variables the image expects, for example in a README.md file accompanying the Dockerfile.
Environment variables are a simple yet effective mechanism for configuring your containerized ML applications. By understanding how to use ENV, ARG, -e, and --env-file, you can create flexible, reusable Docker images that adapt to various deployment scenarios without requiring constant code changes or image rebuilds. This promotes consistency and simplifies the management of your ML environments.
© 2025 ApX Machine Learning