This hands-on lab focuses on the practical application of gradient boosting by building, evaluating, and interpreting a gradient boosting model for a classification problem. We will use the GradientBoostingClassifier from Scikit-Learn on a well-known dataset to demonstrate the entire workflow in action.
For this exercise, we'll use the wine recognition dataset, which is conveniently available within Scikit-Learn. The task is to predict the class of wine (one of three cultivars) based on 13 chemical properties. This dataset is excellent for our purposes because it contains only numerical features and has a clear classification objective.
Our first step is to import the necessary libraries and load the dataset. We'll use pandas to manage our data in a DataFrame, which makes it easier to inspect and manipulate. We will also split the data into training and testing sets to ensure we can evaluate our model's performance on unseen data.
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, classification_report
# Load the dataset
wine = load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)
y = pd.Series(wine.target)
# Display the first few rows of the features
print("Features (X):")
print(X.head())
# Display the target variable distribution
print("\nTarget (y) value counts:")
print(y.value_counts())
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(f"\nTraining set shape: {X_train.shape}")
print(f"Testing set shape: {X_test.shape}")
The stratify=y argument in train_test_split ensures that the proportion of each wine class is the same in both the training and testing sets as it is in the original dataset. This is a good practice for classification problems to prevent imbalances from affecting model training and evaluation.
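If you want to confirm that stratification worked as expected, a quick check (a minimal sketch) is to compare the normalized class counts in each split:

# Compare class proportions across the splits
print("Class proportions in y_train:")
print(y_train.value_counts(normalize=True).sort_index())
print("\nClass proportions in y_test:")
print(y_test.value_counts(normalize=True).sort_index())

The proportions should be nearly identical, up to the rounding imposed by the split sizes.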
With our data prepared, we can now instantiate and train the GradientBoostingClassifier. We will start with a basic set of parameters. n_estimators=100 means we will build 100 sequential trees. The learning_rate of 0.1 controls the contribution of each tree, and max_depth=3 limits the complexity of each individual tree to prevent overfitting. The random_state ensures that our results are reproducible.
# Initialize the GradientBoostingClassifier
gbm = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    random_state=42
)
# Fit the model to the training data
gbm.fit(X_train, y_train)
The .fit() method initiates the training process. The model sequentially adds trees, with each new tree attempting to correct the errors made by the ensemble of preceding trees.
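One way to observe this sequential error correction is the fitted model's train_score_ attribute, which records the training loss after each boosting stage. As a quick sketch:

# Training loss recorded after each boosting stage; it should
# decrease as successive trees correct the remaining errors
train_loss = gbm.train_score_
print(f"Loss after tree 1:   {train_loss[0]:.4f}")
print(f"Loss after tree 50:  {train_loss[49]:.4f}")
print(f"Loss after tree 100: {train_loss[-1]:.4f}")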
After training, we use the test set, which the model has not seen, to evaluate its predictive power. We use the .predict() method to get the predicted class for each sample in the test set.
# Make predictions on the test set
y_pred = gbm.predict(X_test)
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.4f}\n")
# Display a detailed classification report
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=wine.target_names))
The output will show a high accuracy score and a classification report with precision, recall, and F1-score for each wine class. This gives us a comprehensive view of the model's ability to correctly classify samples across all categories.
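To see exactly where any misclassifications occur, a confusion matrix is a useful complement to the report; a minimal sketch:

from sklearn.metrics import confusion_matrix

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(cm)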
A good prediction is useful, but understanding why the model makes certain predictions is often more valuable. We will now interpret our trained model by examining feature importances and visualizing partial dependence.
The feature_importances_ attribute of a trained GBM model provides a score for each feature, indicating how useful it was in constructing the decision trees within the ensemble. A higher score means the feature was used more frequently and effectively in splits that reduce impurity.
Let's visualize the importances to quickly identify the most influential features.
import pandas as pd
# Get feature importances
importances = gbm.feature_importances_
feature_names = X.columns
# Create a DataFrame for easier visualization
feature_importance_df = pd.DataFrame(
    {'feature': feature_names, 'importance': importances}
).sort_values('importance', ascending=False)
print("Top 10 Feature Importances:")
print(feature_importance_df.head(10))
To make this information more accessible, we can create a bar chart.
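One way to draw it is a horizontal bar chart with matplotlib (a minimal sketch; adapt the styling to your needs):

import matplotlib.pyplot as plt

# Horizontal bar chart of importances, largest at the top
plt.figure(figsize=(8, 6))
plt.barh(feature_importance_df['feature'], feature_importance_df['importance'])
plt.gca().invert_yaxis()  # put the most important feature on top
plt.xlabel('Importance')
plt.title('Gradient Boosting Feature Importances')
plt.tight_layout()
plt.show()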
The most important features for classifying the wine cultivar, according to the model. Features like proline and flavanoids have a significant impact on the model's predictions.
This chart clearly shows that proline, flavanoids, and color_intensity are the three most significant drivers of the model's predictions. The remaining features have a much smaller contribution.
While feature importance tells us which features are important, Partial Dependence Plots (PDPs) show us how a feature affects the model's predictions. A PDP isolates the effect of one feature while averaging out the effects of all other features.
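To make the averaging idea concrete, here is a brute-force sketch for a single feature: fix proline to each value on a grid for every training row, leave the other features untouched, and average the predicted probability of class 0. (The Scikit-Learn utility we use next does this more efficiently, working on the model's decision function, and plots the result.)

import numpy as np

# Manual partial dependence of 'proline' for class 0
grid = np.linspace(X_train['proline'].min(), X_train['proline'].max(), 20)
avg_prob_class0 = []
for value in grid:
    X_mod = X_train.copy()
    X_mod['proline'] = value           # fix the feature of interest
    probs = gbm.predict_proba(X_mod)   # other features keep their real values
    avg_prob_class0.append(probs[:, 0].mean())

# Inspect a few (proline value, average probability) pairs
for v, p in list(zip(grid, avg_prob_class0))[:5]:
    print(f"proline={v:7.1f} -> P(class_0)={p:.3f}")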
Let's create PDPs for the two most important features: proline and flavanoids. We'll use Scikit-Learn's PartialDependenceDisplay utility. Because this is a multiclass problem, we must also pass the target argument to select which class to plot; we start with class 0. The y-axis then shows the partial dependence of the model's decision function (roughly, the log-odds) for that class.
from sklearn.inspection import PartialDependenceDisplay
import matplotlib.pyplot as plt
# Create partial dependence plots for the top two features
fig, ax = plt.subplots(figsize=(12, 5))
display = PartialDependenceDisplay.from_estimator(
    gbm,
    X_train,
    features=['proline', 'flavanoids'],
    feature_names=X.columns.tolist(),
    target=0,  # class index to plot; required for multiclass models
    ax=ax
)
plt.suptitle("Partial Dependence of Top Features on Wine Class Prediction")
plt.show()
By examining the generated plots, you can deduce relationships. For example, the PDP for proline will likely show that as the proline value increases, the model's score for 'class_0' also increases significantly. Repeating the plot for the other classes (see the sketch below) lets you check how the same features affect 'class_1' and 'class_2'; the plot for flavanoids, for instance, may show a positive relationship up to a certain point, after which the effect levels off or changes. These visualizations provide direct evidence of the marginal effect of each feature on the model's output.
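One way to examine every cultivar is to repeat the plot for each class index (a minimal sketch):

# Repeat the partial dependence plots for every wine class
for class_idx, class_name in enumerate(wine.target_names):
    display = PartialDependenceDisplay.from_estimator(
        gbm,
        X_train,
        features=['proline', 'flavanoids'],
        feature_names=X.columns.tolist(),
        target=class_idx
    )
    display.figure_.suptitle(f"Partial dependence for {class_name}")
plt.show()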
This practical exercise has taken you through the complete lifecycle of a machine learning project using Scikit-Learn's gradient boosting implementation: from data preparation to model training, evaluation, and finally, interpretation. You now have a solid foundation for applying this powerful algorithm to your own datasets and are well-prepared to explore the more advanced, performance-oriented libraries in the upcoming chapters.