Linear models serve as a fundamental building block in supervised machine learning. They are appreciated for their simplicity, interpretability, and computational efficiency, often providing a solid baseline for more complex modeling tasks. In Julia, the MLJ.jl (Machine Learning in Julia) framework offers a consistent and powerful interface for implementing and evaluating these models. This section will guide you through building and training two common linear models: linear regression for continuous target variables and logistic regression for classification tasks.
You'll learn how to leverage MLJ.jl's workflow, which typically involves loading a model, preparing your data, wrapping the model and data into a 'machine', training the machine, and then making predictions.
Linear regression aims to model the relationship between a set of input features and a continuous output variable by fitting a linear equation to observed data. The basic form for a single feature is y = β₀ + β₁x + ϵ, where y is the target, x is the feature, β₀ is the intercept, β₁ is the coefficient for the feature, and ϵ represents the error term. For multiple features, this extends to y = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₙxₙ + ϵ.
MLJ.jl provides access to various linear regression model implementations from different packages. A common choice is LinearRegressor from the MLJLinearModels.jl package.
Let's walk through an example. First, ensure you have the necessary packages. You can add them using Julia's package manager:
using Pkg
Pkg.add(["MLJ", "MLJLinearModels", "DataFrames", "Plots", "Random"]) # Add Plots for visualization if you wish
Now, let's implement a simple linear regression model.
using MLJ
using DataFrames
using Random
# Load the LinearRegressor model
LinearRegressor = @load LinearRegressor pkg=MLJLinearModels verbosity=0
# 1. Generate some synthetic data
Random.seed!(123) # for reproducibility
n_samples = 100
X_vec = rand(n_samples) .* 10
intercept = 2.0
coefficient = 3.5
noise = randn(n_samples) .* 2.0
y = intercept .+ coefficient .* X_vec .+ noise
# Convert to a DataFrame for X (MLJ expects a table for features)
# and ensure y is a vector
X = DataFrame(feature1 = X_vec)
# 2. Instantiate the model
model = LinearRegressor()
# 3. Create a machine by binding the model to the data
mach = machine(model, X, y)
# 4. Train the model (fit the machine)
fit!(mach, verbosity=0)
# 5. Make predictions on the training data
y_pred = predict(mach, X)
# 6. Inspect the learned parameters (optional)
fp = fitted_params(mach)
println("Learned intercept: $(fp.intercept)")
println("Learned coefficient: $(fp.coefs[1])") # Since we have one feature
In this example:
- y has a linear relationship with X_vec (our feature1), plus some random noise.
- LinearRegressor is loaded from MLJLinearModels.jl using the @load macro. This macro dynamically loads the model code and makes it available.
- The model is instantiated with model = LinearRegressor().
- A machine (mach) is constructed, which is an MLJ construct that binds a model to data. This machine contains the model, the data, and will store the learned parameters after training.
- fit!(mach) trains the model. verbosity=0 suppresses output during fitting.
- predict(mach, X) uses the trained machine to make predictions on new data (here, we use the original X for simplicity).
- fitted_params(mach) allows you to inspect the parameters learned by the model, such as the intercept and coefficients.

For a dataset like this, you might visualize the actual data points and the fitted regression line.
A scatter plot of the synthetic data points with the corresponding fitted linear regression line. The actual fitted line would be derived from fp.intercept and fp.coefs.
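One way to produce such a figure is a short Plots.jl sketch like the following (assuming Plots is installed, as suggested in the Pkg.add call earlier, and reusing X, y, and mach from the example above). Rather than reading the slope out of fp.coefs, it simply predicts on a grid of feature values:

using Plots
# Scatter of the synthetic observations
scatter(X.feature1, y, label="data", xlabel="feature1", ylabel="y")
# Overlay the fitted line by predicting on a grid of feature values
xs = collect(range(0, 10, length=100))
plot!(xs, predict(mach, DataFrame(feature1 = xs)), label="fitted line", linewidth=2)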
Logistic regression is used for classification problems, typically when the target variable is categorical (e.g., true/false, cat/dog/bird). It models the probability that an input belongs to a particular class using a logistic function (or sigmoid function) applied to a linear combination of input features. For a binary classification problem, the probability of the positive class (y=1) is given by:
P(y=1 ∣ X) = 1 / (1 + e^(−(β₀ + β₁x₁ + ⋯ + βₙxₙ)))

MLJ.jl provides models like LogisticClassifier from MLJLinearModels.jl for this purpose.
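As a standalone illustration of this formula (independent of any MLJ model), the logistic function can be written directly in Julia; the intercept and coefficient values below are arbitrary, hypothetical numbers rather than learned parameters:

# Logistic (sigmoid) function
sigmoid(z) = 1 / (1 + exp(-z))
# Probability of the positive class for one observation,
# using hypothetical parameter values
β0, β1 = -1.0, 2.0
x1 = 0.8
p = sigmoid(β0 + β1 * x1)   # ≈ 0.65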
A critical step when using MLJ.jl for classification is to ensure your target variable y has the correct scientific type, typically Multiclass or Finite. You might need to coerce your target vector.
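For example, a minimal sketch of coercing a plain string vector (the labels here are just illustrative data) might look like this:

using MLJ   # `coerce` and the scientific types are re-exported by MLJ
raw_labels = ["A", "B", "A", "B"]          # illustrative labels stored as Strings
labels = coerce(raw_labels, Multiclass)    # categorical vector with scitype Multiclass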
Let's set up a binary classification example:
using MLJ
using DataFrames
using Random
using CategoricalArrays # For creating categorical target variables
# Load the LogisticClassifier model
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0
# 1. Generate synthetic data for binary classification
Random.seed!(456)
n_samples = 100
X1 = randn(n_samples)
# Create a dependency for y on X1, e.g., if X1 > 0.5, more likely class "B"
y_logical = (X1 .+ 0.5 .* randn(n_samples)) .> 0.5
y_categorical = categorical(map(x -> x ? "B" : "A", y_logical))
# Convert X1 to a DataFrame
X = DataFrame(feature1 = X1)
# 2. Instantiate the model
model = LogisticClassifier()
# 3. Create a machine
# MLJ automatically infers the scientific type of y_categorical as Multiclass
mach = machine(model, X, y_categorical)
# 4. Train the model
fit!(mach, verbosity=0)
# 5. Make predictions
# Predict probabilities (returns a distribution for each instance)
y_prob_dist = predict(mach, X)
# To get the probability of a specific class (e.g., "B")
prob_B = pdf.(y_prob_dist, "B") # pdf evaluates each distribution at class "B", giving its predicted probability
# Predict the most likely class
y_pred_mode = predict_mode(mach, X)
# 6. Inspect learned parameters (optional)
fp = fitted_params(mach)
println("Learned coefficients: $(fp.coefs)")
println("Learned intercept: $(fp.intercept)")
# Compare actual vs predicted
# println("Actual classes: $(y_categorical[1:5])")
# println("Predicted classes: $(y_pred_mode[1:5])")
# println("Predicted probabilities for class 'B': $(round.(prob_B[1:5], digits=2))")
In this logistic regression example:
- X1 influences the binary outcome y_categorical ("A" or "B"). Importantly, y_categorical is created as a CategoricalVector, which MLJ.jl understands as Multiclass. If your target is, for example, a Vector{String}, you would need to coerce it using y = coerce(y_raw, Multiclass).
- LogisticClassifier is loaded and instantiated.
- A machine is created, binding the model to X and y_categorical.
- fit!(mach) trains the logistic regression model.
- predict(mach, X) for classifiers in MLJ.jl typically returns a vector of probability distributions. You can extract probabilities for specific classes using pdf(distribution, class_label).
- predict_mode(mach, X) directly gives the class with the highest predicted probability.
- fitted_params(mach) again shows the learned model parameters.

Visualizing logistic regression often involves plotting the sigmoid curve if you have one feature, or the decision boundary for two features.
Illustrative plot showing data points for two classes and a sigmoid curve representing the probability of belonging to 'Class B' based on 'Feature (X1)'. The actual curve would depend on the fitted model parameters.
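A minimal plotting sketch along these lines, reusing the fitted classification machine from above and again assuming Plots.jl is installed, could be:

using Plots
# Grid of feature values and the model's predicted probability of class "B"
xs = collect(range(minimum(X1), maximum(X1), length=200))
probs = pdf.(predict(mach, DataFrame(feature1 = xs)), "B")
plot(xs, probs, label="P(class = B)", xlabel="feature1 (X1)", ylabel="probability", linewidth=2)
# Overlay the observations, encoding class "B" as 1 and class "A" as 0
scatter!(X1, Int.(y_logical), label="observed data (A=0, B=1)")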
MLJ.jl relies heavily on ScientificTypes.jl to ensure data is in the correct format for models. For linear regression, the target y should be Continuous. For classification, the target y should be Finite (e.g., Multiclass or OrderedFactor). Features in X are typically Continuous or Count.
If your data isn't in the expected scientific type, you'll need to use coerce or coerce!. For example:
# For a target vector `y_raw` intended for regression:
# y = coerce(y_raw, Continuous)
# For a feature matrix `X_table` where all columns should be continuous:
# X = coerce(X_table, Continuous)
# Or, for specific columns:
# X = coerce(X_table, :col1 => Continuous, :col2 => Count)
Always check the scitype of your data (scitype(X), scitype(y)) and the model's expected types (input_scitype(model), target_scitype(model)) if you encounter issues.
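With the data and models from the earlier examples in scope, these checks might look like the following sketch:

# Inspect the scientific types of your data
scitype(X)   # scientific type of the feature table
scitype(y)   # scientific type of the target vector
# Inspect what a given model expects for its input and target
input_scitype(LinearRegressor())
target_scitype(LinearRegressor())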
Overfitting can be an issue in linear models, especially with many features. Regularization techniques add a penalty term to the loss function to shrink coefficient values, which can improve generalization to unseen data.
Many linear model implementations in MLJ.jl (like those in MLJLinearModels.jl) allow you to specify regularization. For example:
# Ridge Regressor
RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
ridge_model = RidgeRegressor(lambda=1.0) # lambda is the regularization strength
# Lasso Regressor
LassoRegressor = @load LassoRegressor pkg=MLJLinearModels verbosity=0
lasso_model = LassoRegressor(lambda=0.1)
# Logistic Classifier with L2 penalty
# LogisticClassifier from MLJLinearModels typically includes L2 by default,
# lambda controls the strength.
logreg_model_reg = LogisticClassifier(lambda=0.5)
The lambda (or sometimes alpha) parameter controls the strength of the regularization. Choosing the right value for lambda is usually done via hyperparameter tuning, which is covered later in this chapter.
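As a quick, illustrative sketch of the effect (not a rigorous comparison), you could fit the ridge model above on fresh synthetic data and compare its coefficient with an unregularized fit:

# Fresh synthetic regression data (same generating process as the earlier example)
Xr = DataFrame(feature1 = rand(50) .* 10)
yr = 2.0 .+ 3.5 .* Xr.feature1 .+ randn(50)
# Unregularized fit vs. ridge fit
ols_mach = machine(LinearRegressor(), Xr, yr)
fit!(ols_mach, verbosity=0)
ridge_mach = machine(ridge_model, Xr, yr)
fit!(ridge_mach, verbosity=0)
println("OLS coefficients:   $(fitted_params(ols_mach).coefs)")
println("Ridge coefficients: $(fitted_params(ridge_mach).coefs)")
# Increasing lambda shrinks the ridge coefficient further toward zero.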
Linear models, whether for regression or classification, are versatile tools. Their implementation in Julia using MLJ.jl follows a consistent pattern that simplifies model building, training, and prediction. Understanding this workflow is essential as you move on to more complex algorithms and build comprehensive machine learning pipelines.