Containers are ephemeral by default. Any data written inside a container's filesystem is lost when the container is removed. For multi-container applications, managing persistent data such as datasets, trained models, logs, or database files becomes essential. While docker run allows mounting volumes using the -v flag, Docker Compose provides a more structured and manageable way to define and attach volumes to your services.
In Docker Compose, you manage volumes in two main steps:

1. Declare them at the top level of the docker-compose.yml file under the volumes: section. This tells Compose about the volumes your application stack requires.
2. Mount them into individual services using a volumes: specification specific to that service.

Named volumes are the preferred mechanism for persisting data generated by and used by Docker containers. Docker manages the storage area on the host machine, and you only need to refer to the volume by its name.
To declare a named volume, add a top-level volumes: section to your docker-compose.yml:
version: '3.8' # Or a later version

services:
  # ... your service definitions go here ...

volumes:
  postgres_data: # Declares a named volume called 'postgres_data'
  ml_models: # Declares another named volume called 'ml_models'
In this example, we've declared two named volumes: postgres_data and ml_models. Compose will create these volumes automatically the first time you run docker-compose up if they don't already exist on your Docker host. You can also specify driver options here if needed, but the default local driver is usually sufficient.
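For example, if you want a named volume backed by a specific host directory (say, a shared disk holding large datasets), you can pass options to the local driver. This is only a sketch; the volume name and the path /mnt/shared/datasets are hypothetical, and the directory must already exist on the host:

volumes:
  dataset_cache:
    driver: local
    driver_opts:
      type: none # Plain bind-style mount, no special filesystem type
      o: bind
      device: /mnt/shared/datasets # Hypothetical host directory to back the volume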
Once a volume is declared, you can mount it into one or more services. Under the specific service definition, use the volumes: key (note: this is different from the top-level key). The syntax is typically volume-name:/path/in/container.
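Besides the short volume-name:/path/in/container form, Compose also supports a more explicit long syntax, which is useful when you need options such as a read-only mount. A minimal sketch with a hypothetical service named app and volume named shared_data:

services:
  app:
    volumes:
      - type: volume
        source: shared_data # Must also be declared under the top-level volumes: key
        target: /data
        read_only: true # The container can read but not modify the volume contents

volumes:
  shared_data: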
Let's extend the previous example. Imagine you have a PostgreSQL database service for storing experiment metadata and an inference API service that needs access to trained models.
version: '3.8'

services:
  db:
    image: postgres:14-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: ml_metadata
    volumes:
      - postgres_data:/var/lib/postgresql/data # Mount 'postgres_data' volume

  inference_api:
    build: ./inference_service # Assumes a Dockerfile here
    ports:
      - "8000:8000"
    volumes:
      - ml_models:/app/models # Mount 'ml_models' volume
    depends_on:
      - db

volumes:
  postgres_data:
  ml_models:
Here's what's happening:
- db service: We mount the postgres_data volume (declared at the top level) to the standard PostgreSQL data directory /var/lib/postgresql/data inside the container. All database files created by PostgreSQL now persist in this Docker-managed volume, surviving container restarts or removal.
- inference_api service: We mount the ml_models volume to /app/models inside the container. If a training service (perhaps run separately or defined in the same Compose file) saves models into this volume, the inference_api service can load them from /app/models.

This setup ensures data persistence and allows data sharing between containers if needed (though in this specific example, each volume is used by one service).
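To make that sharing concrete, here is a sketch of how a hypothetical trainer service could write into the same ml_models volume that inference_api reads from. The build context ./training_service and the training command are placeholders, not part of the example above:

services:
  trainer:
    build: ./training_service # Hypothetical Dockerfile for the training code
    command: python train.py --output-dir /app/models # Placeholder training command
    volumes:
      - ml_models:/app/models # Same named volume mounted by inference_api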
Diagram showing two services defined in Docker Compose, each mounting a distinct named volume managed by Docker.
Managing volumes through Docker Compose offers several benefits:
- Declarative configuration: Volume definitions live alongside your service definitions (in docker-compose.yml), making the configuration easy to understand and manage.
- Automatic creation: Compose creates any declared named volumes that don't already exist (when you run docker-compose up).
- Straightforward lifecycle: Named volumes outlive individual containers, and you can remove them along with the rest of the stack using docker-compose down -v. This simplifies cleanup, as shown in the command sketch below.

In ML workflows managed with Compose, volumes are frequently used for:

- Persisting database files for experiment metadata or tracking stores.
- Sharing trained model artifacts between training and inference services.
- Caching datasets so they don't need to be fetched again for every run.
- Collecting logs and other outputs produced inside containers.
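A quick way to see this lifecycle in practice, assuming you run the commands from the directory containing the docker-compose.yml above:

# Start the stack; declared named volumes are created if they don't exist
docker-compose up -d

# Inspect volumes on the host; Compose-created ones are prefixed with the project name
docker volume ls

# Stop and remove the containers but keep the named volumes and their data
docker-compose down

# Stop and remove the containers AND the named volumes declared in the file
docker-compose down -v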
By defining and mounting volumes within your docker-compose.yml file, you gain a manageable way to handle persistent data for your multi-container ML applications, ensuring that important information like models, datasets, and configurations survives the lifespan of individual containers. This approach simplifies setup, enhances reproducibility, and aligns with containerization best practices.