Following our examination of XGBoost and LightGBM, we now focus on CatBoost, a gradient boosting library optimized for a specific, yet common, challenge: effectively handling categorical features. Standard approaches, such as one-hot encoding or target encoding, often require preprocessing steps that can be suboptimal or introduce target leakage. CatBoost integrates its solutions for categorical data directly into the training algorithm.
This chapter covers the topics outlined in the sections below. Upon completing it, you will understand CatBoost's distinctive methods and be prepared to apply them, particularly to problems involving significant categorical data.
6.1 Motivation: Challenges with Categorical Data
6.2 Ordered Target Statistics (Ordered TS)
6.3 Addressing Prediction Shift: Ordered Boosting
6.4 Handling Feature Combinations
6.5 Oblivious Trees
6.6 GPU Training Acceleration
6.7 CatBoost API: Parameters and Configuration
6.8 Hands-on Practical: Implementing CatBoost
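As a brief preview of the API explored in sections 6.7 and 6.8, the sketch below shows the typical workflow of passing raw categorical columns directly to CatBoost rather than encoding them beforehand. The dataset, column names, and hyperparameter values are illustrative assumptions, not settings prescribed by this chapter.

```python
# Minimal sketch of the CatBoost classification API (assumed example data).
import pandas as pd
from catboost import CatBoostClassifier

# Small synthetic dataset with one categorical and one numeric feature.
df = pd.DataFrame({
    "city": ["london", "paris", "paris", "berlin", "london", "berlin"],
    "visits": [3, 10, 7, 1, 5, 2],
    "clicked": [0, 1, 1, 0, 1, 0],
})
X, y = df[["city", "visits"]], df["clicked"]

# cat_features tells CatBoost which columns to treat as categorical,
# so no manual one-hot or target encoding is required.
model = CatBoostClassifier(
    iterations=50,
    learning_rate=0.1,
    depth=4,
    cat_features=["city"],
    verbose=0,                 # suppress per-iteration logging
    allow_writing_files=False, # avoid writing training artifacts to disk
)
model.fit(X, y)
print(model.predict(X))
```

The hyperparameters shown here are placeholders; sections 6.7 and 6.8 discuss parameter choices and a complete hands-on example.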