Greedy Function Approximation: A Gradient Boosting Machine, Jerome H. Friedman, 2001, The Annals of Statistics, Vol. 29 (Institute of Mathematical Statistics), DOI: 10.1214/aos/1013203451 - This paper introduced the Gradient Boosting Machine algorithm, detailing its functional gradient descent formulation and the application of various loss functions for regression and classification tasks.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, Jerome Friedman, 2009 (Springer) - This textbook offers an extensive treatment of Gradient Boosting, explaining its mechanics, the role of loss functions, and their influence on model behavior in statistical learning.
Robust Statistics: Theory and Methods, Peter J. Huber, Elvezio M. Ronchetti, 2009 (John Wiley & Sons, Inc.), DOI: 10.1002/9780470434232 - This book provides a foundational understanding of robust statistical methods, including the mathematical properties of and motivations for loss functions such as Huber loss and L1 loss, which limit the influence of outliers.