Environment variables provide a standard mechanism for passing configuration settings into Docker containers. This approach is particularly valuable for Machine Learning projects because it lets you configure container behavior dynamically, without modifying the underlying code or rebuilding images. They are indispensable for parameterizing ML environments, influencing script execution, and managing sensitive information.
Environment variables are dynamic named values that can affect the way running processes behave on a computer. Within a Docker container, they function similarly, providing a way to inject configuration from the outside (the Docker host or orchestration system) into the isolated container environment.
ENV Instruction
The primary way to define environment variables within your Dockerfile is using the ENV instruction. This sets variables that will be available both during the subsequent steps of the image build process and when containers are run from the resulting image.
The ENV instruction has two forms:
- ENV <key>=<value>: Sets a single variable. If the value contains spaces, enclose it in quotes or escape the spaces with backslashes.
- ENV <key1>=<value1> <key2>=<value2> ...: Sets multiple variables in a single instruction. This form is often preferred as it creates fewer image layers.
Let's see an example within an ML context:
# Dockerfile
FROM python:3.9-slim
# Set default logging level and model directory
ENV LOG_LEVEL=INFO \
MODEL_DIR=/app/models
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# This script might use the MODEL_DIR variable
CMD ["python", "train.py"]
In this Dockerfile, LOG_LEVEL is set to INFO and MODEL_DIR is set to /app/models. Any scripts running inside the container (like train.py) can access these values, typically using standard library functions available in the programming language (e.g., os.environ.get('MODEL_DIR') in Python). Setting defaults like this makes your image more self-contained and predictable.
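As a rough illustration of the script side, the sketch below shows one way train.py might read these two variables. The load_settings helper is hypothetical and not part of the original project; the fallback values simply mirror the ENV defaults from the Dockerfile above.

```python
import logging
import os


def load_settings(environ=os.environ):
    """Read LOG_LEVEL and MODEL_DIR, falling back to the Dockerfile defaults."""
    return {
        # Normalise to upper case so "debug" and "DEBUG" behave the same.
        "log_level": environ.get("LOG_LEVEL", "INFO").upper(),
        "model_dir": environ.get("MODEL_DIR", "/app/models"),
    }


if __name__ == "__main__":
    settings = load_settings()
    # Fall back to INFO if LOG_LEVEL is not a name the logging module knows.
    level = getattr(logging, settings["log_level"], logging.INFO)
    logging.basicConfig(level=level)
    logging.info("Saving models to %s", settings["model_dir"])
```

Passing the mapping as a parameter is a design choice that keeps the function easy to test with a plain dictionary instead of the real process environment.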
ARG
Docker also provides the ARG instruction to define variables that users can pass at build-time using the --build-arg flag with the docker build command.
# Dockerfile
ARG PYTHON_VERSION=3.9
FROM python:${PYTHON_VERSION}-slim
ARG CACHE_DATE=not_set
ENV LAST_REFRESHED_AT=$CACHE_DATE
# ... rest of the Dockerfile
Here, PYTHON_VERSION is a build argument with a default value of 3.9. You could build this image using a different Python version like this:
docker build --build-arg PYTHON_VERSION=3.10 -t my-ml-app .
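If you build the same image for several Python versions, the command above can be assembled programmatically. The following is a minimal sketch, not an official tool: docker_build_command is a hypothetical helper and the py-suffixed tags are illustrative. Only the --build-arg and -t flags come from the Docker CLI.

```python
import shlex


def docker_build_command(tag, build_args, context="."):
    """Return the argv list for `docker build`, one --build-arg per entry."""
    command = ["docker", "build"]
    for name, value in build_args.items():
        command += ["--build-arg", f"{name}={value}"]
    command += ["-t", tag, context]
    return command


if __name__ == "__main__":
    for version in ("3.9", "3.10"):
        argv = docker_build_command(
            f"my-ml-app:py{version}", {"PYTHON_VERSION": version}
        )
        # Print the command; hand argv to subprocess.run(argv, check=True)
        # to actually run the build.
        print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids shell quoting problems when a build argument contains spaces.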
An important distinction exists between ARG and ENV:
- ARG variables are generally only available during the image build process. They do not persist as environment variables in the final image or running containers by default.
- ENV variables are available during the build and persist in the final image, accessible by applications running inside containers launched from that image.
You can, however, make a build argument persist as an environment variable by defining an ENV instruction that uses the ARG variable, as shown with LAST_REFRESHED_AT in the example above. This pattern is useful for embedding build-time information into the runtime environment.
Use ARG for parameters that affect the build itself, such as base image versions or source repositories. Do not rely on ARG to protect secrets: build argument values can be recovered from the image with docker history, so build-time credentials are better passed with BuildKit secret mounts (RUN --mount=type=secret). Use ENV for runtime configuration defaults needed by your application.
One of the significant benefits of environment variables is the ability to override the defaults set in the Dockerfile when you launch a container. This provides flexibility without needing to rebuild the image. The docker run command accepts the -e (or --env) flag for this purpose.
# Run container with default settings
docker run my-ml-app
# Override the LOG_LEVEL for debugging
docker run -e LOG_LEVEL=DEBUG my-ml-app
# Override the model directory
docker run -e MODEL_DIR=/data/production_model my-ml-app
# Set a new variable not defined in the Dockerfile
docker run -e API_KEY=xyz123 my-ml-app
You can use the -e flag multiple times to set or override several variables. This runtime configuration is fundamental for adapting a generic ML image to specific tasks, environments (development, staging, production), or datasets.
Managing many environment variables via -e flags on the command line can become cumbersome. Docker allows you to place these variables in a file (conventionally named .env) and pass that file during container startup using the --env-file flag.
An example .env file:
# .env file
LOG_LEVEL=INFO
MODEL_DIR=/data/shared/models
DATABASE_URL=postgresql://user:pass@dbhost:5432/ml_results
S3_BUCKET=my-ml-artifacts
Each line follows the KEY=VALUE format. Comments start with #.
You can then run the container like this:
docker run --env-file .env my-ml-app
Docker reads the variables from the .env file and sets them in the container environment. Variables set explicitly with -e will override those defined in the .env file if there are conflicts. Using .env files is a good practice for managing configuration, especially for separating settings between different deployment environments. It also helps keep sensitive information out of shell history or command-line logs.
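The precedence described above can be modelled in a few lines. This is a simplified sketch for reasoning about which value wins, not Docker's actual parser: parse_env_file and effective_environment are hypothetical names, and lines without an = sign are simply ignored here.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    variables = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Simplification: bare names without "=" are ignored.
        variables[key] = value
    return variables


def effective_environment(image_env, env_file_vars, cli_vars):
    """Merge in increasing priority: image ENV, then --env-file, then -e."""
    merged = dict(image_env)
    merged.update(env_file_vars)
    merged.update(cli_vars)
    return merged


if __name__ == "__main__":
    image_env = {"LOG_LEVEL": "INFO", "MODEL_DIR": "/app/models"}
    file_vars = parse_env_file("# .env file\nMODEL_DIR=/data/shared/models\n")
    cli_vars = {"LOG_LEVEL": "DEBUG"}
    print(effective_environment(image_env, file_vars, cli_vars))
```

In this model, the -e value for LOG_LEVEL wins over the image default, while MODEL_DIR comes from the file because no -e flag sets it.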
Environment variables are widely used in containerized ML workflows:
- Paths for input data (DATA_DIR), output models (MODEL_DIR), logs (LOG_DIR), or temporary files (TMP_DIR).
- Hyperparameters (LEARNING_RATE, BATCH_SIZE) can sometimes be set via ENV, especially for inference servers.
- Credentials and other secrets should not be hardcoded in the Dockerfile using ENV. Prefer runtime injection using -e, --env-file, or dedicated secrets management systems in production.
- Runtime behavior: setting LOG_LEVEL, enabling/disabling features (ENABLE_GPU=true), or specifying ML framework backends (KERAS_BACKEND=tensorflow).
- Connection details for other services (DATABASE_HOST=db, REDIS_PORT=6379).

A few practices help keep this manageable:
- Do not put secrets in the Dockerfile using ENV. These become part of the image layers and can be inspected, for example with docker inspect. Use runtime injection (-e, --env-file) or proper secrets management tools for production workloads.
- Use ENV in your Dockerfile to provide reasonable default values for common configurations. This makes the image easier to use out-of-the-box.
- Distinguish ARG and ENV: Use ARG for build-time customization and ENV for runtime configuration.
- Document the variables your image expects, for example in a README.md file accompanying the Dockerfile.

Environment variables are a simple yet effective mechanism for configuring your containerized ML applications. By understanding how to use ENV, ARG, -e, and --env-file, you can create flexible, reusable Docker images that adapt to various deployment scenarios without requiring constant code changes or image rebuilds. This promotes consistency and simplifies the management of your ML environments.
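Several of the use cases listed above (typed hyperparameters, feature flags, and secrets injected at runtime) can be combined in one small loader. This is a sketch under stated assumptions: load_config and env_bool are hypothetical helpers, and the fallback values 0.001 and 32 are placeholders rather than recommendations.

```python
import os


def env_bool(environ, name, default=False):
    """Interpret common truthy strings; anything else counts as False."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_config(environ=os.environ):
    """Build a typed config, failing fast if the secret was not injected."""
    if "API_KEY" not in environ:
        raise RuntimeError("API_KEY must be injected at runtime (-e or --env-file)")
    return {
        # Environment values are always strings, so convert explicitly.
        "learning_rate": float(environ.get("LEARNING_RATE", "0.001")),  # placeholder default
        "batch_size": int(environ.get("BATCH_SIZE", "32")),  # placeholder default
        "enable_gpu": env_bool(environ, "ENABLE_GPU"),
        "api_key": environ["API_KEY"],
    }
```

Failing at startup when a required secret is missing tends to be easier to debug than a failure deep inside a training or inference run.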
© 2026 ApX Machine Learning