This hands-on lab focuses on the practical application of gradient boosting by building, evaluating, and interpreting a gradient boosting model for a classification problem. We will use the GradientBoostingClassifier from Scikit-Learn on a well-known dataset to demonstrate the entire workflow in action.

### The Dataset: Wine Recognition

For this exercise, we'll use the wine recognition dataset, which is conveniently available within Scikit-Learn. The task is to predict the class of wine (one of three cultivars) based on 13 chemical properties. This dataset is excellent for our purposes because it contains only numerical features and has a clear classification objective.

### Step 1: Loading and Preparing the Data

Our first step is to import the necessary libraries and load the dataset. We'll use pandas to manage our data in a DataFrame, which makes it easier to inspect and manipulate. We will also split the data into training and testing sets to ensure we can evaluate our model's performance on unseen data.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the dataset
wine = load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)
y = pd.Series(wine.target)

# Display the first few rows of the features
print("Features (X):")
print(X.head())

# Display the target variable distribution
print("\nTarget (y) value counts:")
print(y.value_counts())

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(f"\nTraining set shape: {X_train.shape}")
print(f"Testing set shape: {X_test.shape}")
```

The stratify=y argument in train_test_split ensures that the proportion of each wine class is the same in both the training and testing sets as it is in the original dataset. This is a good practice for classification problems to prevent imbalances from affecting model training and evaluation.

### Step 2: Training the Gradient Boosting Classifier

With our data prepared, we can now instantiate and train the GradientBoostingClassifier. We will start with a basic set of parameters. n_estimators=100 means we will build 100 sequential trees. The learning_rate of 0.1 controls the contribution of each tree, and max_depth=3 limits the complexity of each individual tree to prevent overfitting. The random_state ensures that our results are reproducible.

```python
# Initialize the GradientBoostingClassifier
gbm = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    random_state=42
)

# Fit the model to the training data
gbm.fit(X_train, y_train)
```

The .fit() method initiates the training process. The model sequentially adds trees, with each new tree attempting to correct the errors made by the ensemble of preceding trees.
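Before moving on to formal evaluation, it can be instructive to watch this sequential correction happen. The short sketch below is an optional aside, not one of the original lab steps: it uses the classifier's staged_predict method, which yields the ensemble's predictions after each boosting stage, to trace how training accuracy improves as trees are added.

```python
# Optional aside: trace training accuracy after each boosting stage.
# staged_predict yields class predictions using only the first 1, 2, ..., 100 trees.
staged_train_accuracy = [
    accuracy_score(y_train, stage_pred)
    for stage_pred in gbm.staged_predict(X_train)
]

# Print accuracy at a few checkpoints to see the sequential improvement
for n_trees in (1, 5, 20, 100):
    print(f"Trees: {n_trees:3d}  training accuracy: {staged_train_accuracy[n_trees - 1]:.4f}")
```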
### Step 3: Evaluating Model Performance

After training, we use the test set, which the model has not seen, to evaluate its predictive power. We use the .predict() method to get the predicted class for each sample in the test set.

```python
# Make predictions on the test set
y_pred = gbm.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.4f}\n")

# Display a detailed classification report
print("Classification Report:")
print(classification_report(y_test, y_pred, target_names=wine.target_names))
```

The output will show a high accuracy score and a classification report with precision, recall, and F1-score for each wine class. This gives us a comprehensive view of the model's ability to correctly classify samples across all categories.

### Step 4: Interpreting the Model

A good prediction is useful, but understanding why the model makes certain predictions is often more valuable. We will now interpret our trained model by examining feature importances and visualizing partial dependence.

#### Feature Importance

The feature_importances_ attribute of a trained GBM model provides a score for each feature, indicating how useful it was in constructing the decision trees within the ensemble. A higher score means the feature was used more frequently and effectively to make splits that reduce impurity.

Let's visualize the importances to quickly identify the most influential features.

```python
# Get feature importances
importances = gbm.feature_importances_
feature_names = X.columns

# Create a DataFrame for easier visualization
feature_importance_df = pd.DataFrame(
    {'feature': feature_names, 'importance': importances}
).sort_values('importance', ascending=False)

print("Top 10 Feature Importances:")
print(feature_importance_df.head(10))
```

To make this information more accessible, we can create a bar chart.

[Figure: horizontal bar chart of feature importance scores, with proline, flavanoids, and color_intensity as the top three bars.] The most important features for classifying the wine cultivar, according to the model. Features like proline and flavanoids have a significant impact on the model's predictions.

This chart clearly shows that proline, flavanoids, and color_intensity are the three most significant drivers of the model's predictions. The remaining features have a much smaller contribution.
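Impurity-based importances like these can overstate features with many distinct values and can be diluted when features are correlated, so it is worth cross-checking them. The sketch below is an optional addition to the lab: it uses Scikit-Learn's permutation_importance to measure how much test-set accuracy drops when each feature is shuffled, giving a complementary view of importance.

```python
from sklearn.inspection import permutation_importance

# Optional cross-check: permutation importance on the held-out test set.
# Each feature is shuffled n_repeats times; the drop in accuracy indicates
# how much the model relies on that feature.
perm_result = permutation_importance(
    gbm, X_test, y_test, n_repeats=10, random_state=42
)

perm_df = pd.DataFrame(
    {'feature': X.columns, 'importance': perm_result.importances_mean}
).sort_values('importance', ascending=False)

print("Permutation importances (test set):")
print(perm_df.head(10))
```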
#### Partial Dependence Plots

While feature importance tells us which features are important, Partial Dependence Plots (PDPs) show us how a feature affects the model's predictions. A PDP isolates the effect of one feature while averaging out the effects of all other features.

Let's create PDPs for the two most important features: proline and flavanoids. We'll use Scikit-Learn's PartialDependenceDisplay utility. Because this is a multiclass problem, Scikit-Learn requires us to choose the class to analyze via the target argument; we'll start with the first cultivar, 'class_0'. The y-axis on these plots shows the partial dependence of the model's predicted probability for that class.

```python
from sklearn.inspection import PartialDependenceDisplay
import matplotlib.pyplot as plt

# Create partial dependence plots for the top two features
fig, ax = plt.subplots(figsize=(12, 5))
display = PartialDependenceDisplay.from_estimator(
    gbm,
    X_train,
    features=['proline', 'flavanoids'],
    feature_names=X.columns.tolist(),
    target=0,  # class_0; re-run with 1 or 2 for the other cultivars
    ax=ax
)
plt.suptitle("Partial Dependence of Top Features on Wine Class Prediction")
plt.show()
```

By examining the generated plots, you can deduce relationships. For example, the PDP for proline will likely show that as the proline value increases, the predicted probability of the wine belonging to 'class_0' also increases significantly. Similarly, the plot for flavanoids might show a positive relationship with 'class_0' up to a certain point, after which the effect levels off. Re-running the code with target=1 or target=2 reveals the corresponding effects for the other two cultivars. These visualizations provide direct evidence of the marginal effect of each feature on the model's output.

This practical exercise has taken you through the complete lifecycle of a machine learning project using Scikit-Learn's gradient boosting implementation: from data preparation to model training, evaluation, and finally, interpretation. You now have a solid foundation for applying this powerful algorithm to your own datasets and are well-prepared to explore the more advanced, performance-oriented libraries in the upcoming chapters.
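As one optional next step before moving on, you might tune the hyperparameters we fixed by hand in Step 2 rather than accepting them as given. The sketch below shows one possible approach using Scikit-Learn's GridSearchCV; the specific grid values are illustrative assumptions, not recommendations from this lab.

```python
from sklearn.model_selection import GridSearchCV

# Illustrative grid over the hyperparameters introduced in Step 2
param_grid = {
    'n_estimators': [100, 200],
    'learning_rate': [0.05, 0.1],
    'max_depth': [2, 3],
}

# 5-fold cross-validated search on the training set only
search = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.4f}")
```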