Once you have MLflow set up, the next step is to instrument your machine learning code to record the important details of each training run. At the core of experiment tracking are parameters and metrics. Parameters represent the input configurations for a run, such as hyperparameters or feature selection choices. Metrics represent the output or results, typically evaluation scores or loss values, that quantify the performance of the run.
MLflow provides a straightforward Python API to log this information. The two primary functions you'll use are mlflow.log_param() and mlflow.log_metric().
Parameters are the settings you define before starting a training run. They often include:
- Model hyperparameters, such as the learning rate, regularization strength, or number of epochs
- Algorithm choices, such as the solver or model type
- Data handling settings, such as the train/test split ratio or random seed
Parameters are typically logged once at the beginning of an MLflow run. You log a single parameter using mlflow.log_param(key, value), where key is a string name for the parameter and value is its value (a string, number, or boolean; MLflow stores it as a string). To log multiple parameters at once, you can use mlflow.log_params(params_dict), where params_dict is a dictionary of parameter names and values.
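For instance, here is a minimal sketch of batch logging with mlflow.log_params(); the parameter names and values are illustrative:

import mlflow

# Illustrative parameter dictionary
params = {
    "alpha": 0.1,           # regularization strength
    "solver": "liblinear",  # optimizer choice
    "max_iter": 100,        # iteration cap
}

with mlflow.start_run():
    # Log every key/value pair in the dictionary in one call
    mlflow.log_params(params)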
Let's see how this looks in practice. Imagine you are training a classification model. You might want to log the regularization strength (alpha) and the type of solver used.
import mlflow
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

# Load data (example)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define parameters
alpha_val = 0.1
solver_type = 'liblinear'

# Start an MLflow run
with mlflow.start_run():
    # Log parameters
    mlflow.log_param("alpha", alpha_val)
    mlflow.log_param("solver", solver_type)
    print(f"Logging parameters: alpha={alpha_val}, solver={solver_type}")

    # Instantiate and train the model
    lr = LogisticRegression(C=1/alpha_val, solver=solver_type, random_state=42)
    lr.fit(X_train, y_train)

    # Make predictions
    y_pred = lr.predict(X_test)

    # Calculate accuracy (metric)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Model Accuracy: {accuracy:.4f}")

    # Log metric (covered next)
    mlflow.log_metric("accuracy", accuracy)

print("MLflow run completed.")
In this snippet, inside the mlflow.start_run() context, we explicitly call mlflow.log_param() for alpha and solver. These values will now be associated with this specific run in the MLflow tracking system.
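If you want to confirm what was recorded, one way is to read the run back through the tracking API. The following is a minimal sketch that re-logs a parameter in a fresh run and fetches it afterward (the alpha value is illustrative; note that MLflow stores parameter values as strings):

import mlflow

# Start a short run and capture its ID for lookup afterward
with mlflow.start_run() as run:
    mlflow.log_param("alpha", 0.1)
    run_id = run.info.run_id

# Fetch the stored run and inspect its logged parameters
finished_run = mlflow.get_run(run_id)
print(finished_run.data.params)  # e.g. {'alpha': '0.1'}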
Metrics are the quantitative outputs of your run that measure performance or behavior. Common examples include:
- Evaluation scores, such as accuracy, precision, or recall
- Training and validation loss values
- Timing or other behavioral measurements, such as training duration
Metrics can be logged at any point during the run using mlflow.log_metric(key, value, step=None), where:
- key: A string name for the metric.
- value: The numeric value of the metric.
- step: An optional integer representing a sequence or time step (like an epoch number). If provided, MLflow records the metric's history over these steps.

Similar to parameters, you can log multiple metrics at once using a dictionary with mlflow.log_metrics(metrics_dict, step=None), as sketched below.
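Here is a minimal sketch of batch metric logging with mlflow.log_metrics(); the metric names and values are illustrative:

import mlflow

with mlflow.start_run():
    # Log several related metrics in a single call
    # (an optional step argument would record them against that step)
    mlflow.log_metrics({
        "accuracy": 0.92,
        "precision": 0.90,
        "recall": 0.94,
    })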
Let's extend the previous example to log the final accuracy score:
# (Previous code: imports, data loading, parameter definition)

# Start an MLflow run
with mlflow.start_run():
    # Log parameters
    mlflow.log_param("alpha", alpha_val)
    mlflow.log_param("solver", solver_type)

    # Instantiate and train the model
    lr = LogisticRegression(C=1/alpha_val, solver=solver_type, random_state=42)
    lr.fit(X_train, y_train)

    # Make predictions
    y_pred = lr.predict(X_test)

    # Calculate accuracy
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Model Accuracy: {accuracy:.4f}")

    # Log the final accuracy metric
    mlflow.log_metric("accuracy", accuracy)
    print(f"Logged metric: accuracy={accuracy:.4f}")

print("MLflow run completed.")
Here, mlflow.log_metric("accuracy", accuracy) records the final performance score.

Often, you'll want to track how a metric changes during training, for example, the loss after each epoch. The step argument is designed for this.
Consider a simplified training loop where we simulate epoch-based training and log loss at each step:
import mlflow
import time
import random

# Start an MLflow run
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("epochs", 5)
    print("Simulating training loop...")

    # Simulate training for several epochs
    initial_loss = 1.0
    for epoch in range(5):
        # Simulate training work
        time.sleep(0.5)

        # Calculate simulated loss (decreasing randomly)
        current_loss = initial_loss * (1 - random.uniform(0.1, 0.3))
        initial_loss = current_loss

        # Log loss metric for this epoch
        mlflow.log_metric("train_loss", current_loss, step=epoch)
        print(f"Epoch {epoch}: Logged train_loss={current_loss:.4f}")

    # Log a final metric (e.g., validation accuracy)
    final_val_accuracy = 0.85 + random.uniform(-0.05, 0.05)
    mlflow.log_metric("validation_accuracy", final_val_accuracy)
    print(f"Final validation_accuracy={final_val_accuracy:.4f}")

print("MLflow run completed.")
In this example, mlflow.log_metric("train_loss", current_loss, step=epoch) logs the loss value for each epoch. When you view this run in the MLflow UI, you'll be able to see a plot showing how the training loss decreased over the epochs.
Example visualization of how training loss logged with steps might appear in the MLflow UI.
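You can also read the per-step history back programmatically through the tracking client. A minimal sketch, assuming run_id holds the ID of the run logged above:

from mlflow.tracking import MlflowClient

client = MlflowClient()

# run_id is assumed to hold the ID of the run from the loop example.
# Each history entry carries the step and the value logged at that step.
for entry in client.get_metric_history(run_id, "train_loss"):
    print(f"step={entry.step}, train_loss={entry.value:.4f}")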
By consistently logging parameters and metrics, you create a detailed record of each experiment. This makes it significantly easier to understand what settings produced which results, compare different runs, and reproduce successful outcomes later. The information logged using these functions becomes the foundation for analyzing your experiments in the MLflow UI, which we will explore in subsequent sections.