For classification problems, Scikit-Learn provides the GradientBoostingClassifier class, an implementation of the Gradient Boosting Machine tailored for predicting categorical outcomes. This classifier builds an additive model in a forward, stage-wise fashion. At each stage, a regression tree is fitted on the negative gradient of the binomial or multinomial deviance loss function. This process allows the model to incrementally improve its performance by focusing on the observations that are difficult to classify correctly.
The mechanics are a direct application of the principles from the previous chapter. Instead of fitting trees to minimize squared error as one would in a simple regression context, GradientBoostingClassifier minimizes a loss function suitable for classification, such as deviance (also known as log-loss or logistic loss).
For binary classification, the model makes an initial prediction, often the log-odds of the positive class. Then, for each boosting stage:

1. Compute the pseudo-residuals, the negative gradient of the log-loss with respect to the current predictions. For log-loss, this is simply the difference between the true labels and the currently predicted probabilities.
2. Fit a regression tree to these pseudo-residuals.
3. Add the tree's output, scaled by the learning rate, to the current log-odds prediction.
This sequential process refines the model's prediction, gradually pushing the predicted log-odds in the right direction to correctly classify the training samples.
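To make these steps concrete, here is a minimal from-scratch sketch of a few boosting stages for binary classification. It fits a DecisionTreeRegressor to the pseudo-residuals and applies a simplified update that adds the raw tree output scaled by the learning rate; the variable names (F, eta, pseudo_residuals) are purely illustrative, and GradientBoostingClassifier itself additionally refines each leaf's value before updating, so treat this as an educational sketch rather than the library's exact procedure.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=500, random_state=0)

# Initial prediction: the constant log-odds of the positive class
p = y.mean()
F = np.full(len(y), np.log(p / (1 - p)))  # current log-odds for every sample
eta = 0.1  # learning rate (illustrative value)

for stage in range(3):
    # Negative gradient of the log-loss w.r.t. the current log-odds:
    # the difference between the true labels and the predicted probabilities
    prob = 1 / (1 + np.exp(-F))
    pseudo_residuals = y - prob

    # Fit a small regression tree to the pseudo-residuals
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    tree.fit(X, pseudo_residuals)

    # Update the log-odds, scaled by the learning rate
    F += eta * tree.predict(X)

Each pass nudges the log-odds toward the correct class for the samples the current ensemble still gets wrong, which is the behavior GradientBoostingClassifier implements internally (with the additional per-leaf adjustment noted above).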
A diagram of the iterative process in Gradient Boosting. Each new weak learner is trained to correct the errors (gradients) of the previous model, and its contribution is scaled by the learning rate before updating the overall prediction.
When you instantiate GradientBoostingClassifier, you can configure several parameters that significantly influence its behavior. While we will cover tuning in detail in a later chapter, it is important to understand the main ones from the start.
loss: The loss function to be optimized. The default is 'log_loss', which supports both binary and multiclass classification.

learning_rate: A positive float, typically set between 0.0 and 1.0, that scales the contribution of each tree. A lower learning rate requires more boosting stages to achieve the same level of training error but often results in better generalization. This is a form of regularization.

n_estimators: The number of boosting stages to perform. This is the total number of trees in the ensemble. More trees can lead to better performance on the training data, but also to overfitting if the number is too high.

max_depth: The maximum depth of the individual regression estimators. The maximum depth limits the number of nodes in each tree, controlling its complexity. A smaller depth reduces variance and helps prevent overfitting.

subsample: The fraction of samples used for fitting the individual base learners. If smaller than 1.0, this results in Stochastic Gradient Boosting, which can reduce variance and improve generalization at the cost of slightly increased bias.

Let's walk through a simple example of using GradientBoostingClassifier. We will use Scikit-Learn's make_classification to generate a synthetic dataset, then train a model and evaluate its performance.
# Import necessary libraries
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
# 1. Generate a synthetic dataset
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=10,
    n_redundant=5,
    n_classes=2,
    random_state=42
)
# 2. Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
# 3. Initialize the GradientBoostingClassifier
# We'll set a few parameters for this example
gb_clf = GradientBoostingClassifier(
    n_estimators=100,    # Number of trees (boosting stages)
    learning_rate=0.1,   # Step size shrinkage
    max_depth=3,         # Max depth of each tree
    subsample=0.8,       # Fraction of samples for training each tree
    random_state=42
)
# 4. Fit the model to the training data
gb_clf.fit(X_train, y_train)
# 5. Make predictions on the test data
y_pred = gb_clf.predict(X_test)
# 6. Evaluate the model's performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.4f}")
# Example Output:
# Model Accuracy: 0.8900
In this code, we initialize a GradientBoostingClassifier with 100 trees (n_estimators=100), a learning rate of 0.1, and a maximum tree depth of 3. We also use stochastic gradient boosting by setting subsample=0.8, meaning each tree is trained on a random 80% of the training data. After fitting the model, we use it to make predictions and find that it achieves a respectable accuracy on our synthetic test set.
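To see how the number of boosting stages interacts with the learning rate, you can inspect the fitted model with its staged_predict method, which yields the ensemble's predictions after each successive stage. The short snippet below assumes the gb_clf, X_test, and y_test objects from the example above are still in scope; it does not retrain anything.

from sklearn.metrics import accuracy_score

# Test accuracy after each boosting stage (one entry per tree added)
staged_accuracies = [
    accuracy_score(y_test, y_pred_stage)
    for y_pred_stage in gb_clf.staged_predict(X_test)
]

# Print a few checkpoints to show how accuracy evolves as trees are added
for n_trees in (10, 50, 100):
    print(f"Accuracy after {n_trees:>3} trees: {staged_accuracies[n_trees - 1]:.4f}")

If the staged accuracy plateaus or starts to drop well before the final stage, lowering n_estimators, reducing the learning rate, or enabling early stopping via n_iter_no_change are common adjustments.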
This class provides a solid foundation for tackling classification tasks. The next section will introduce its counterpart for regression problems, the GradientBoostingRegressor.