As you begin tracking multiple machine learning trials using mlflow.log_param(), mlflow.log_metric(), and mlflow.log_artifact(), your collection of runs can grow rapidly. Imagine training dozens of models for predicting customer churn, each with slightly different hyperparameters or feature sets. Simply listing all these runs together makes it challenging to isolate those related to a specific goal or approach. MLflow provides a straightforward organizational structure to manage this complexity: Experiments.
An MLflow Experiment is essentially a named container for a group of related runs. Think of it as a folder or a project space dedicated to a particular machine learning problem or objective you are exploring. For instance, you might create separate experiments for:
customer-churn-prediction
product-recommendation-engine
sentiment-analysis-reviews
All runs associated with developing the churn prediction model would then be logged under the customer-churn-prediction experiment, keeping them separate from runs related to product recommendations.
When you start logging runs without explicitly specifying an experiment, MLflow uses a default experiment. If you're logging to a local mlruns directory (the default backend), MLflow creates an experiment named Default with an ID of 0. While functional for quick tests, relying solely on the default experiment quickly leads to a disorganized collection of runs from potentially unrelated projects.
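You can confirm what this default experiment looks like by querying it through the tracking API. The snippet below is a minimal sketch that inspects the built-in Default experiment on a local mlruns backend; the exact values printed depend on your tracking configuration.

import mlflow

# Look up the built-in default experiment by name.
# With the local filesystem backend this is named "Default" and has ID "0".
default_exp = mlflow.get_experiment_by_name("Default")

if default_exp is not None:
    print(f"Name: {default_exp.name}")
    print(f"ID: {default_exp.experiment_id}")
    print(f"Artifact location: {default_exp.artifact_location}")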
To effectively organize your work, you should create and assign experiments explicitly. You can manage experiments using the MLflow Python API.
To create a new experiment programmatically, use the mlflow.create_experiment() function. It requires a name for the experiment and optionally accepts an artifact_location. If you don't provide an artifact_location, MLflow will use a default location relative to the tracking backend (e.g., a subdirectory within mlruns for the local filesystem backend).
import mlflow

# Define the name for your new experiment
experiment_name = "house-price-prediction-xgboost"

try:
    # Create the experiment. It returns the ID of the created experiment.
    experiment_id = mlflow.create_experiment(name=experiment_name)
    print(f"Experiment '{experiment_name}' created with ID: {experiment_id}")
except mlflow.exceptions.MlflowException:
    # Handle cases where the experiment might already exist
    print(f"Experiment '{experiment_name}' already exists.")
    # Optionally, get the ID of the existing experiment
    experiment = mlflow.get_experiment_by_name(experiment_name)
    experiment_id = experiment.experiment_id
    print(f"Using existing experiment ID: {experiment_id}")
This code attempts to create an experiment. If an experiment with that name already exists, it catches the exception and retrieves the existing experiment's ID using mlflow.get_experiment_by_name().
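If you want an experiment's artifacts stored somewhere other than the default location, you can pass artifact_location at creation time. The sketch below uses a hypothetical local URI (file:///tmp/mlflow-artifacts/house-prices); substitute whatever storage location your team actually uses, such as an S3 or GCS bucket.

import mlflow

# Create an experiment whose artifacts are stored under a custom location.
# The file:// URI below is purely illustrative.
experiment_id = mlflow.create_experiment(
    name="house-price-prediction-custom-artifacts",
    artifact_location="file:///tmp/mlflow-artifacts/house-prices",
)
print(f"Created experiment with ID: {experiment_id}")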
Once you have an experiment (either newly created or existing), you need to tell MLflow to log subsequent runs within that experiment's context. The primary way to do this programmatically is using mlflow.set_experiment(). You can provide either the experiment name or the experiment ID. If you provide a name that doesn't correspond to an existing experiment, mlflow.set_experiment() will conveniently create it for you.
import mlflow
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Define the experiment name
experiment_name = "simple-linear-regression"

# Set the active experiment. Creates it if it doesn't exist.
mlflow.set_experiment(experiment_name)

# Sample data generation (replace with your actual data loading)
X = np.random.rand(100, 1) * 10
y = 2.5 * X.squeeze() + np.random.randn(100) * 2
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Start an MLflow run within the active experiment
with mlflow.start_run(run_name="baseline_run") as run:
    print(f"Starting run {run.info.run_id} in experiment {run.info.experiment_id}")

    # Instantiate and train the model
    lr = LinearRegression()
    lr.fit(X_train, y_train)

    # Make predictions
    predictions = lr.predict(X_test)

    # Calculate metrics
    rmse = np.sqrt(mean_squared_error(y_test, predictions))

    # Log parameters and metrics
    mlflow.log_param("model_type", "LinearRegression")
    mlflow.log_param("train_test_split_random_state", 42)
    mlflow.log_metric("rmse", rmse)
    print(f"Logged RMSE: {rmse}")

print("Run completed. Find details in the MLflow UI.")
In this example, mlflow.set_experiment("simple-linear-regression") ensures that the subsequent mlflow.start_run() block logs its parameters and metrics under the "simple-linear-regression" experiment. If this experiment didn't exist before running the script, MLflow would create it automatically.
Alternatively, you can specify the experiment directly within mlflow.start_run() using the experiment_id argument, but this requires the experiment to exist beforehand.
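As a minimal sketch of that alternative, assuming the "simple-linear-regression" experiment from the previous example was already created, you can look up its ID and pass it to mlflow.start_run() directly:

import mlflow

# Look up an existing experiment; get_experiment_by_name returns None
# if the experiment has not been created yet.
experiment = mlflow.get_experiment_by_name("simple-linear-regression")

# Start a run in that experiment without calling set_experiment first.
with mlflow.start_run(experiment_id=experiment.experiment_id, run_name="explicit_experiment_id"):
    mlflow.log_param("model_type", "LinearRegression")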
The real benefit of organizing runs into experiments becomes apparent when using the MLflow UI. When you launch the UI (typically by running mlflow ui in your terminal from the directory containing the mlruns folder), you'll see a list of your experiments in the left-hand navigation panel.
Hierarchical organization in MLflow: The UI displays Experiments, each containing multiple Runs.
Selecting an experiment filters the main view to show only the runs associated with that experiment. This makes it far easier to sort, inspect, and compare related runs side by side instead of scanning one long list of runs from unrelated projects.
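The same per-experiment filtering is available programmatically. This is a minimal sketch using mlflow.search_runs(), assuming the "simple-linear-regression" experiment from the earlier example exists; it returns that experiment's runs as a pandas DataFrame you can sort or filter further.

import mlflow

# Fetch all runs logged under a single experiment as a pandas DataFrame.
runs = mlflow.search_runs(experiment_names=["simple-linear-regression"])

# Each logged metric and parameter appears as a column, e.g. "metrics.rmse".
print(runs[["run_id", "metrics.rmse", "params.model_type"]].sort_values("metrics.rmse"))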
While MLflow doesn't enforce naming rules beyond uniqueness, adopting consistent conventions helps maintain clarity:
Use descriptive names that identify the project or objective (e.g., fraud-detection-v1, recommendation-api-model).
Reflect distinct phases of the same project in the name where it helps (e.g., churn-prediction-feature-engineering, churn-prediction-hyperparameter-tuning).
Avoid vague, generic names like test or my-experiment.
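Consistent names also pay off outside the UI. As a small sketch, in recent MLflow versions mlflow.search_experiments() lists every experiment registered against the active tracking backend, so well-chosen names are immediately meaningful when scanned programmatically:

import mlflow

# List all experiments known to the active tracking backend.
for exp in mlflow.search_experiments():
    print(f"{exp.experiment_id}: {exp.name}")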
By thoughtfully grouping your runs into experiments, you transform MLflow from a simple logger into a powerful organizational tool. This structured approach is fundamental for managing the iterative process of model development, enabling easier analysis, comparison, and reproduction of your machine learning work.