Greedy Function Approximation: A Gradient Boosting Machine, Jerome H. Friedman, 2001. The Annals of Statistics, Vol. 29. DOI: 10.1214/aos/1013203451 - This is the seminal paper introducing the Gradient Boosting Machine (GBM) algorithm, detailing its formulation as gradient descent in function space and its generalization of previous boosting methods.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, 2009 (Springer) - This influential textbook provides a comprehensive explanation of Gradient Boosting, including its theoretical foundations and practical aspects, often referencing Friedman's work directly.
CS229 Lecture Notes: Boosting, John Duchi, 2018. Stanford University CS229: Machine Learning Course Notes - These lecture notes from a reputable machine learning course provide a concise explanation of boosting methods, including the progression from AdaBoost to Gradient Boosting and the functional gradient descent view.