Building upon the understanding of individual services, volumes, and networks within Docker Compose, let's examine how these elements combine to define common application stacks encountered in Machine Learning projects. Using a `docker-compose.yml` file allows us to declaratively define and manage these multi-component systems efficiently for local development and testing, mirroring more complex deployment setups.
A frequent pattern involves deploying a trained model as an API endpoint for making predictions. Often, this API needs to interact with a database to store prediction logs, retrieve user information, or manage other application state. Docker Compose makes setting up this API-plus-database stack straightforward.
Consider an application consisting of:

- API Service (`api`): A custom container running a web framework (like Flask or FastAPI) that loads a trained model and exposes prediction endpoints. It needs access to the database.
- Database Service (`db`): A standard database container (like PostgreSQL or MySQL) where the API service can read/write data. Its data needs to persist across container restarts.

*Diagram: a user request hits the API service, which communicates with the Database service over a shared Docker network; the Database service uses a volume for data persistence.*
Here’s a simplified `docker-compose.yml` representing this stack:
```yaml
version: '3.8'  # Specify Compose file version

services:
  api:
    build: ./api  # Path to the directory containing the API's Dockerfile
    ports:
      - "5000:5000"  # Map host port 5000 to container port 5000
    volumes:
      - ./api:/app            # Mount local API code into the container (for development)
      - ./models:/app/models  # Mount models directory
    environment:
      - DATABASE_URL=postgresql://user:password@db:5432/mydatabase
      # Other API specific config...
    depends_on:
      - db  # Starts db before api; on its own this does not wait for the
            # database to be ready (that needs a healthcheck plus the long
            # form "condition: service_healthy")
    networks:
      - app-net

  db:
    image: postgres:14-alpine  # Use an official PostgreSQL image
    volumes:
      - postgres_data:/var/lib/postgresql/data  # Named volume for data persistence
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=mydatabase
    networks:
      - app-net
    # Optional: add a healthcheck so dependents can use condition: service_healthy
    # healthcheck:
    #   test: ["CMD-SHELL", "pg_isready -U user -d mydatabase"]
    #   interval: 10s
    #   timeout: 5s
    #   retries: 5

volumes:
  postgres_data:  # Define the named volume

networks:
  app-net:  # Define the custom network
    driver: bridge
```
Key takeaways from this example:
- Two services are defined: `api` and `db`.
- The `api` service is built from a local `Dockerfile` using `build: ./api`, while the `db` service uses a pre-built image from Docker Hub (`image: postgres:14-alpine`).
- Both services are attached to a custom bridge network, `app-net`. This allows the `api` service to connect to the database using the hostname `db` (the service name) and the standard PostgreSQL port `5432`. The connection string `postgresql://user:password@db:5432/mydatabase` demonstrates this.
- A bind mount (`./api:/app`) is used for the `api` service, allowing code changes on the host to be reflected immediately in the container during development. A named volume (`postgres_data`) is used for the `db` service to ensure that the database files persist even if the `db` container is removed and recreated.
- `environment` is used to pass configuration. The `db` service uses these variables to initialize the database (user, password, name). The `api` service uses them to know how to connect to the database.
- The `ports` mapping `- "5000:5000"` makes the API accessible from the host machine on port 5000.
- `depends_on`: This ensures that the `db` service is started before the `api` service attempts to connect, although it doesn't guarantee the database inside the container is fully ready without a `healthcheck` (see the sketch below for one way to handle this in application code).

Running `docker compose up` in the directory containing this file will build the API image (if needed), pull the PostgreSQL image, create the network and volume, and start both containers.
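To make the `api` side concrete, here is a minimal sketch of a Flask application that uses the `DATABASE_URL` variable defined in the Compose file. The endpoint, the `prediction_log` table, and the retry helper are illustrative assumptions rather than parts of the stack above, and it assumes `flask` and `psycopg2-binary` are installed in the `api` image. The retry loop compensates for `depends_on` not guaranteeing that PostgreSQL is ready to accept connections.

```python
# app.py - a minimal sketch of the api service (illustrative, not a
# definitive implementation). Assumes flask and psycopg2-binary are
# installed in the image built from ./api.
import os
import time

import psycopg2
from flask import Flask, jsonify, request

app = Flask(__name__)

# Provided by docker-compose.yml; the hostname "db" resolves to the
# database container on the shared app-net network.
DATABASE_URL = os.environ["DATABASE_URL"]


def connect_with_retry(url, attempts=10, delay=2.0):
    """Retry connecting, since depends_on only orders container startup
    and does not guarantee PostgreSQL is ready for connections."""
    for attempt in range(attempts):
        try:
            return psycopg2.connect(url)
        except psycopg2.OperationalError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)


@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()
    prediction = 0.0  # placeholder: run your loaded model here
    # Log the request to PostgreSQL (assumes a prediction_log table exists).
    conn = connect_with_retry(DATABASE_URL)
    try:
        with conn, conn.cursor() as cur:
            cur.execute(
                "INSERT INTO prediction_log (features, prediction) VALUES (%s, %s)",
                (str(features), prediction),
            )
    finally:
        conn.close()
    return jsonify({"prediction": prediction})


if __name__ == "__main__":
    # Bind to 0.0.0.0 so the "5000:5000" port mapping is reachable from the host.
    app.run(host="0.0.0.0", port=5000)
```

The sketch opens a fresh connection per request to keep the example simple; a production service would typically use a connection pool instead.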
Another common scenario involves running a training script within a container while logging metrics, parameters, and artifacts to an experiment tracking server like MLflow. Docker Compose can manage the training container, the MLflow tracking server, and potentially a backend database for MLflow.
Components:

- Training Service (`trainer`): A container that runs the ML training script. It needs to know the URI of the MLflow server.
- MLflow Tracking Server (`mlflow`): Runs the MLflow tracking server UI and API.
- Backend Database (`db`): (Optional but recommended for robustness) A database (e.g., PostgreSQL) used by MLflow to store experiment metadata.

*Diagram: the Training service sends logs to the MLflow Server over the network; the MLflow Server stores metadata in the Backend DB and reads/writes artifacts to a shared volume/mount; a developer accesses the MLflow UI.*
A possible `docker-compose.yml` could look like this:
```yaml
version: '3.8'

services:
  trainer:
    build: ./training  # Path to training script's Dockerfile context
    command: python train.py --data /data/input.csv --model-output /mlflow_artifacts
    volumes:
      - ./training/src:/app  # Mount training code
      - ./data:/data         # Mount input data
      # Mount the artifact volume at the same path the server uses as
      # --default-artifact-root, so artifact locations recorded by the
      # server resolve correctly inside this container as well.
      - mlflow_artifacts:/mlflow_artifacts
    environment:
      - MLFLOW_TRACKING_URI=http://mlflow:5001  # Tell script where MLflow server is
    depends_on:
      mlflow:
        condition: service_started  # Basic dependency
      db:
        condition: service_healthy  # Wait for DB (requires healthcheck)
    networks:
      - ml-net

  mlflow:
    # Consider using an official or well-maintained MLflow image, or build
    # your own if customization is needed (e.g., the PostgreSQL driver must
    # be available for the backend store URI below to work).
    image: ghcr.io/mlflow/mlflow:v2.10.0
    command: >
      mlflow server
      --backend-store-uri postgresql://mlflow_user:mlflow_pass@db:5432/mlflow_db
      --default-artifact-root /mlflow_artifacts
      --host 0.0.0.0
      --port 5001
    ports:
      - "5001:5001"  # Expose MLflow UI port
    volumes:
      - mlflow_artifacts:/mlflow_artifacts  # Share artifact volume
    depends_on:
      db:
        condition: service_healthy  # Wait for DB
    networks:
      - ml-net

  db:
    image: postgres:14-alpine
    volumes:
      - mlflow_db_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=mlflow_user
      - POSTGRES_PASSWORD=mlflow_pass
      - POSTGRES_DB=mlflow_db
    networks:
      - ml-net
    healthcheck:  # Essential for depends_on condition: service_healthy
      test: ["CMD-SHELL", "pg_isready -U mlflow_user -d mlflow_db"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  mlflow_db_data:
  mlflow_artifacts:  # Volume for storing model artifacts, parameters, etc.

networks:
  ml-net:
    driver: bridge
```
In this setup:

- The `trainer` service runs the training script (`train.py`). It receives the location of the MLflow server (`http://mlflow:5001`) via the `MLFLOW_TRACKING_URI` environment variable. The MLflow client library within the script uses this URI to connect.
- The `mlflow` service runs the tracking server. Its `command` specifies the PostgreSQL database (the `db` service) as the `--backend-store-uri` and a shared volume (`mlflow_artifacts`) as the `--default-artifact-root`.
- The `db` service provides the PostgreSQL database for MLflow metadata persistence, using its own named volume (`mlflow_db_data`). A `healthcheck` is added so other services can reliably wait for it using `depends_on`.
- The volume `mlflow_artifacts` is mounted by both the `trainer` (to write artifacts) and `mlflow` (to read/serve artifacts), at the same path in both containers so artifact locations recorded by the server resolve inside the trainer too.
- `depends_on` with `condition: service_healthy` ensures the `trainer` and `mlflow` services wait for the `db` to be ready.

A sketch of what `train.py` might contain follows.
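As a rough illustration (not a definitive script), this minimal `train.py` assumes the `mlflow` package is installed in the `trainer` image. The argument names match the `command` in the Compose file above, while the experiment name and the logged parameter and metric values are placeholders.

```python
# train.py - a minimal sketch of the trainer service's script (illustrative).
# Assumes the mlflow package is installed in the image built from ./training.
import argparse
import os

import mlflow

parser = argparse.ArgumentParser()
parser.add_argument("--data", required=True)
parser.add_argument("--model-output", required=True)
args = parser.parse_args()

# The MLflow client reads MLFLOW_TRACKING_URI (http://mlflow:5001) from the
# environment automatically; setting it explicitly just makes that visible.
mlflow.set_tracking_uri(os.environ["MLFLOW_TRACKING_URI"])
mlflow.set_experiment("compose-demo")  # hypothetical experiment name

with mlflow.start_run():
    # Placeholder training logic: load args.data, fit a model, evaluate it.
    mlflow.log_param("data_path", args.data)
    mlflow.log_metric("accuracy", 0.9)  # placeholder value

    # Write an output file and log it as an artifact. Because the server's
    # --default-artifact-root is a local path, the artifact volume must be
    # mounted at the same path in this container (see the compose file above).
    os.makedirs(args.model_output, exist_ok=True)
    model_path = os.path.join(args.model_output, "model.txt")
    with open(model_path, "w") as f:
        f.write("trained model placeholder")
    mlflow.log_artifact(model_path)
```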
These examples illustrate how Docker Compose defines interconnected services typical in ML development. You can extend these patterns by adding services for data preprocessing, message queues (like Redis or RabbitMQ) for asynchronous tasks, monitoring tools, or different types of databases, all managed within a single `docker-compose.yml` file. This approach significantly simplifies the setup and management of your local development environment, making it closely resemble a potential production deployment structure.