Greedy Function Approximation: A Gradient Boosting Machine, Jerome H. Friedman, 2001, The Annals of Statistics, Vol. 29 (Institute of Mathematical Statistics), DOI: 10.2307/2699986 - Introduces the fundamental gradient boosting machine algorithm, which forms the basis for understanding its iterative nature and susceptibility to overfitting.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, 2009 (Springer) - Provides a statistical treatment of gradient boosting, including a detailed analysis of the bias-variance trade-off and the mechanisms leading to overfitting.
Ensemble Methods: Foundations and Algorithms, Zhi-Hua Zhou, 2012 (Chapman and Hall/CRC), DOI: 10.1201/b12207 - A comprehensive textbook dedicated to ensemble learning, offering insights into boosting's behavior, its strengths, and its inherent challenges with overfitting when not properly controlled.