This section guides you through the practical application of training and evaluating supervised learning models using MLJ.jl. We will work with a familiar dataset, implement a couple of different models, evaluate their performance using cross-validation, and then see how hyperparameter tuning can improve results. By following along, you'll gain confidence in applying the MLJ.jl workflow to your own supervised learning problems.
First, ensure you have MLJ.jl and other necessary packages installed. We'll start by loading these packages and the dataset we'll be using for this exercise. The Iris dataset is a classic choice for classification tasks, and MLJ.jl provides an easy way to load it.
using MLJ
using DataFrames
using PrettyPrinting # For nicer output of MLJ objects
using StableRNGs # For reproducible results
# Load the Iris dataset
X, y = @load_iris; # X is a table of features, y is a categorical vector of targets
X = DataFrame(X);  # convert the column table to a DataFrame so rows can be selected with X[rows, :]
# For reproducibility in data splitting and model training
rng = StableRNG(123)
# Display the first few rows of features and targets
first(X, 3) |> pretty
first(y, 3) # y is a plain categorical vector, so no table pretty-printing is needed
The features X are measurements of sepal length, sepal width, petal length, and petal width for 150 iris flowers. The target y is the species of each flower.
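Before modeling, it can be worth confirming how MLJ interprets the data. The short check below uses schema (provided by MLJ) and levels (from CategoricalArrays, available once MLJ is loaded) to list the feature scientific types and the target classes.
# Inspect the scientific types MLJ assigns to each feature column
schema(X)
# List the species that make up the target classes
levels(y)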
Next, we split our data into training and testing sets. This is a standard practice to evaluate how well our model generalizes to unseen data.
# Split data into training and test sets (70% train, 30% test)
train_rows, test_rows = partition(eachindex(y), 0.7, rng=rng);
X_train = X[train_rows, :];
y_train = y[train_rows];
X_test = X[test_rows, :];
y_test = y[test_rows];
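A quick aside: if you are concerned about class balance between the two sets, partition also accepts a stratify keyword. The variant below is only a sketch of that option; the rest of the walkthrough continues with the plain split above.
# Alternative: stratified split preserving the class proportions of y in both sets
train_s, test_s = partition(eachindex(y), 0.7, stratify=y, rng=StableRNG(123));
println("train: $(length(train_s)) rows, test: $(length(test_s)) rows")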
Let's start with a linear model: Logistic Regression (also known as MultinomialClassifier in MLJ.jl for multiclass problems). The steps follow the standard MLJ pattern:
1. Use @load to make the model type available.
2. Instantiate the model.
3. Wrap the model and training data in a machine object.
4. Train with fit!.
5. Make predictions with predict on the test set.
# Load the MultinomialClassifier model type
LogisticClassifier = @load MultinomialClassifier pkg=MLJLinearModels verbosity=0
# Instantiate the model
logreg_model = LogisticClassifier()
# Wrap the model and training data in a machine
logreg_machine = machine(logreg_model, X_train, y_train)
# Train the model
fit!(logreg_machine, verbosity=0)
# Make predictions on the test set
y_pred_logreg = predict(logreg_machine, X_test)
# Evaluate accuracy
accuracy_logreg = accuracy(mode.(y_pred_logreg), y_test) # mode is used as predict returns distributions
println("Logistic Regression Accuracy: $(round(accuracy_logreg, digits=3))")
You should see an accuracy score printed. For the Iris dataset, Logistic Regression often performs quite well. Let's assume for this walkthrough it achieved an accuracy of around 0.956.
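Accuracy alone does not show which species are being confused with which. MLJ also exposes a confusion_matrix measure; applied to the same point predictions, it gives a per-class breakdown.
# Cross-tabulate predicted vs. true species on the test set
confusion_matrix(mode.(y_pred_logreg), y_test)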
Now, let's try a different type of model, a Decision Tree Classifier.
# Load the DecisionTreeClassifier model type
DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0
# Instantiate the model
tree_model = DecisionTreeClassifier(rng=deepcopy(rng)) # Pass rng for reproducibility of the tree itself
# Wrap in a machine
tree_machine = machine(tree_model, X_train, y_train)
# Train the model
fit!(tree_machine, verbosity=0)
# Make predictions
y_pred_tree = predict(tree_machine, X_test)
# Evaluate accuracy
accuracy_tree = accuracy(mode.(y_pred_tree), y_test)
println("Decision Tree Accuracy (default): $(round(accuracy_tree, digits=3))")
Decision trees, with default parameters, might give slightly different results. For instance, we might observe an accuracy of around 0.933.
Evaluating on a single train-test split can sometimes be misleading due to the specific way the data was divided. Cross-validation provides a more reliable estimate of model performance. MLJ.jl's evaluate function (and its counterpart evaluate!, for machines already bound to data) makes this straightforward. Let's use 6-fold cross-validation for our Decision Tree model.
# Define a resampling strategy: 6-fold cross-validation
cv_strategy = CV(nfolds=6, rng=deepcopy(rng))
# Evaluate the Decision Tree model using cross-validation
# We use the 'model' directly, not the machine already bound to data
# We specify verbosity=0 to suppress output during evaluation
tree_eval = evaluate(tree_model, X_train, y_train,
resampling=cv_strategy,
measure=accuracy,
verbosity=0)
# Display the evaluation results
println("Decision Tree Cross-Validation Results:")
println("Mean Accuracy: $(round(tree_eval.measurement[1], digits=3))")
println("Per-fold Accuracy: $(round.(tree_eval.per_fold[1], digits=3))")
The output will show the accuracy for each fold and the mean accuracy across all folds. This gives us a better idea of how the model is likely to perform on average. For example, the mean accuracy might be around 0.945.
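evaluate can also score several measures over the same folds in one call, which is useful when accuracy by itself is not informative enough. The sketch below adds log_loss, which is computed directly on the probabilistic predictions, so no mode conversion is involved.
# Evaluate the tree with two measures over the same 6-fold CV
tree_eval_multi = evaluate(tree_model, X_train, y_train,
                           resampling=cv_strategy,
                           measures=[accuracy, log_loss],
                           verbosity=0)
println("Mean Accuracy: $(round(tree_eval_multi.measurement[1], digits=3))")
println("Mean Log Loss: $(round(tree_eval_multi.measurement[2], digits=3))")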
Most machine learning models have hyperparameters that can be tuned to improve performance. For a DecisionTreeClassifier, one such hyperparameter is max_depth, which controls the maximum depth of the tree. Let's tune this using a Grid search strategy:
1. Define a range of max_depth values to try.
2. Choose a Grid search as the tuning strategy.
3. Create a TunedModel: wrap the base model, tuning strategy, resampling strategy, and parameter ranges.
4. Fit the TunedModel: this process trains models for each hyperparameter combination and selects the best one.
# Define the range for the max_depth hyperparameter
tree_model_tunable = DecisionTreeClassifier() # Fresh instance for tuning
r_max_depth = range(tree_model_tunable, :max_depth, lower=1, upper=10, scale=:linear);
# Define the tuning strategy (Grid search)
tuning_strategy = Grid(resolution=10) # resolution means 10 values in the range
# Define resampling for tuning (e.g., 3-fold CV to speed up tuning)
resampling_strategy_tuning = CV(nfolds=3, rng=deepcopy(rng))
# Create a TunedModel
tuned_tree_model = TunedModel(model=tree_model_tunable,
resampling=resampling_strategy_tuning,
tuning=tuning_strategy,
range=r_max_depth,
measure=accuracy,
train_best=true) # Automatically retrain the best model on full training data
# Wrap the TunedModel in a machine and fit it
tuned_tree_machine = machine(tuned_tree_model, X_train, y_train)
fit!(tuned_tree_machine, verbosity=0)
# Inspect the report for tuning results
tuning_report = report(tuned_tree_machine)
best_model_params = tuning_report.best_model
best_max_depth = best_model_params.max_depth
println("Best max_depth found: $best_max_depth")
# Extract the fitted parameters of the best model
fitted_params(tuned_tree_machine).best_model |> pretty
# Evaluate the tuned model on the test set
y_pred_tuned_tree = predict(tuned_tree_machine, X_test)
accuracy_tuned_tree = accuracy(mode.(y_pred_tuned_tree), y_test)
println("Tuned Decision Tree Accuracy: $(round(accuracy_tuned_tree, digits=3))")
After tuning, you might find that a specific max_depth
(e.g., 3 or 4) yields better performance on the cross-validation sets used during tuning. Applying this tuned model to our hold-out test set might result in an improved accuracy, say around 0.978.
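It can also be instructive to look at how each candidate max_depth scored during the tuning itself, not just on the hold-out test set. The sketch below assumes the standard fields of a TunedModel report (best_history_entry and history); field names may vary slightly across MLJ versions.
# CV accuracy of the best candidate found during tuning
best_entry = report(tuned_tree_machine).best_history_entry
println("Best CV accuracy during tuning: $(round(best_entry.measurement[1], digits=3))")
# One history entry per max_depth value tried
for entry in report(tuned_tree_machine).history
    println("max_depth = $(entry.model.max_depth): CV accuracy = $(round(entry.measurement[1], digits=3))")
end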
We've now trained and evaluated a few models. It's often helpful to visualize their performance.
Comparison of accuracy scores for Logistic Regression, a default Decision Tree, and a Decision Tree after hyperparameter tuning on the Iris test set. Illustrative values show potential improvements.
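The figure itself is not reproduced here, but a comparable chart is easy to build from the three accuracy values computed above. The sketch below assumes Plots.jl is installed (it is not one of the packages loaded at the start of this section).
using Plots
model_names = ["Logistic Regression", "Decision Tree (default)", "Decision Tree (tuned)"]
scores = [accuracy_logreg, accuracy_tree, accuracy_tuned_tree]
# Bar chart comparing test-set accuracy of the three models
bar(model_names, scores, legend=false, ylabel="Test accuracy", ylims=(0, 1))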
This hands-on exercise demonstrated the core workflow in MLJ.jl for supervised learning: loading data and models, splitting the data into training and test sets, binding models to data with machine, training with fit!, generating predictions with predict, estimating performance with cross-validation via evaluate, and improving a model through hyperparameter tuning with TunedModel.
You are encouraged to experiment further. Try different models available in MLJ.jl, explore other hyperparameters, or apply these techniques to other datasets. The skills you've practiced here form the foundation for building more complex machine learning solutions in Julia.