Now that we've discussed the essential Dockerfile instructions and best practices, let's put theory into practice. In this hands-on exercise, we will build a simple Docker image containing a standard environment for Machine Learning tasks using Python and the popular Scikit-learn library, along with pandas for data manipulation. This image will serve as a consistent foundation for developing and running Scikit-learn based models.
Goal: Create a Docker image with Python 3.9, pip, Scikit-learn, and pandas installed.
Prerequisites: Docker installed and running on your machine.
First, create a dedicated directory for this exercise. This helps keep our files organized.
mkdir sklearn-env
cd sklearn-env
Inside this directory, we will place our Dockerfile and a requirements.txt file.
requirements.txt
Using a requirements.txt file is the standard way to manage Python package dependencies. Docker can leverage this file to install the necessary libraries efficiently.

Create a file named requirements.txt inside the sklearn-env directory with the following content:
# requirements.txt
scikit-learn==1.2.2
pandas==1.5.3
Note: We are pinning specific versions for reproducibility. You might adjust these versions based on your project needs, but using specific versions is generally recommended for consistent environments.
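If you want to confirm that a container actually matches these pins, a small helper script can compare requirements.txt against the installed packages. The following is a minimal sketch; the file name check_versions.py and the simple parsing (exact == pins only) are illustrative assumptions, not part of the exercise itself.

# check_versions.py (hypothetical helper for illustration)
# Compares the == pins in requirements.txt with the packages installed
# in the current Python environment.
from importlib.metadata import PackageNotFoundError, version

with open("requirements.txt") as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and unpinned entries
        name, expected = line.split("==", 1)
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = "not installed"
        status = "OK" if installed == expected else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")

You could copy such a script into the image (or mount it at runtime) and run it with python check_versions.py as a quick consistency check.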
Dockerfile
Now, create the core file: Dockerfile (no extension) in the sklearn-env directory. Add the following instructions:
# Dockerfile for a basic Scikit-learn environment
# 1. Choose the base image
FROM python:3.9-slim
# 2. Set the working directory inside the container
WORKDIR /app
# 3. Copy the requirements file first to leverage Docker cache
COPY requirements.txt .
# 4. Install the Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
# 5. (Optional) Set a default command to run when the container starts
CMD ["python"]
Let's break down this Dockerfile:

- FROM python:3.9-slim: We start with an official Python base image. We chose version 3.9 and the slim variant, which is smaller than the default tag, reducing our final image size while still providing a functional Python environment.
- WORKDIR /app: This sets the working directory for subsequent instructions (COPY, RUN, CMD, ENTRYPOINT) inside the container. If the directory doesn't exist, WORKDIR creates it. It's good practice to set a dedicated working directory.
- COPY requirements.txt .: This copies the requirements.txt file from your build context (the sklearn-env directory) into the container's working directory (/app). We copy it before running pip install because Docker builds images in layers: if requirements.txt hasn't changed since the last build, Docker can reuse the cached layer produced by the subsequent RUN instruction, significantly speeding up rebuilds.
- RUN pip install --no-cache-dir -r requirements.txt: This executes the pip install command inside the container.
  - --no-cache-dir: Disables pip's cache, which helps keep the image smaller since the cache isn't needed in the final image.
  - -r requirements.txt: Tells pip to install the packages listed in the specified file.
- CMD ["python"]: This defines the default command to run when a container is started from this image without specifying a command. In this case, it launches an interactive Python interpreter.

Now, navigate to your sklearn-env directory in your terminal (if you aren't already there) and run the docker build command:
docker build -t sklearn-env:1.0 .
Let's dissect this command:
- docker build: The command to build an image from a Dockerfile.
- -t sklearn-env:1.0: The -t flag tags the image with a name and optionally a tag (version). Here, we name it sklearn-env and tag it 1.0. Tagging makes it easier to reference the image later.
- .: This specifies the build context. Docker looks for the Dockerfile in the current directory (.) and sends the files in this directory (and its subdirectories) to the Docker daemon to execute the build.

You will see Docker executing each step defined in your Dockerfile. It will download the base image (if not already present) and then run each instruction, creating layers for the image.
Once the build completes successfully, you can verify that your environment is set up correctly. Run a temporary container from the image and check the installed package versions:
docker run --rm -it sklearn-env:1.0 python -c "import sklearn; import pandas; print(f'Scikit-learn version: {sklearn.__version__}'); print(f'Pandas version: {pandas.__version__}')"
- docker run: The command to run a container from an image.
- --rm: Automatically removes the container when it exits. This is useful for short-lived verification tasks.
- -it: Runs the container in interactive mode (-i) and allocates a pseudo-TTY (-t), allowing you to interact with it. This isn't strictly necessary for this specific command, but it's common practice.
- sklearn-env:1.0: The name and tag of the image we just built.
- python -c "...": This overrides the default CMD and executes a specific Python command inside the container. The command imports the libraries and prints their versions.

You should see output similar to this (exact versions might differ slightly if you changed requirements.txt):
Scikit-learn version: 1.2.2
Pandas version: 1.5.3
If you see the version numbers printed correctly, congratulations! You have successfully built a Docker image containing a reproducible Scikit-learn environment. This image can now be shared with colleagues or used as a base for containerizing specific training or inference scripts, ensuring everyone works with the same set of dependencies.
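To make this more concrete, here is a minimal, hypothetical train.py script that uses only the packages baked into the image; the script name, dataset, and model choice are illustrative assumptions rather than part of the original exercise.

# train.py (hypothetical example script for illustration)
# Trains a small classifier using only scikit-learn and pandas,
# the two libraries installed in the sklearn-env image.
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the bundled iris dataset as pandas objects
data = load_iris(as_frame=True)
X, y = data.data, data.target

# Show a quick pandas summary of the features
print(X.describe().loc[["mean", "std"]])

# Hold out a test split, fit a simple model, and report accuracy
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
print(f"Test accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")

One way to run such a script without rebuilding the image is to mount your project directory into the container's working directory, for example: docker run --rm -v "$(pwd)":/app sklearn-env:1.0 python train.py. Alternatively, you could COPY the script into the image and point CMD at it.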