The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, 2009 (Springer) - A standard textbook that provides a statistical framework for many machine learning algorithms, including linear models, decision trees, and ensemble methods like bagging and boosting.
Random Forests, Leo Breiman, 2001, Machine Learning, Vol. 45 (Springer), DOI: 10.1023/A:1010933404324 - The original academic paper introducing the Random Forest algorithm, detailing its construction and its benefits in reducing overfitting.
XGBoost: A Scalable Tree Boosting System, Tianqi Chen, Carlos Guestrin, 2016, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM), DOI: 10.1145/2939672.2939785 - This paper presents XGBoost, a popular and efficient implementation of gradient boosting with a focus on scalability and regularization.
LightGBM: A Highly Efficient Gradient Boosting Decision Tree, Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu, 2017, Advances in Neural Information Processing Systems 30 (NIPS 2017), Vol. 30 (Curran Associates, Inc.) - The research paper introducing LightGBM, highlighting its advancements in training speed and efficiency for gradient boosting models, especially on large datasets.
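As a companion to the entries above, the following is a minimal sketch of the two ensemble ideas they cover, bagging (random forests) and boosting, using scikit-learn's RandomForestClassifier and GradientBoostingClassifier on synthetic data. The library choice, dataset, and hyperparameter values are illustrative assumptions and are not taken from the cited papers; XGBoost and LightGBM provide estimators with a similar interface (xgboost.XGBClassifier, lightgbm.LGBMClassifier) that could be swapped in.

    # Illustrative sketch only: compares a bagging ensemble (random forest) with a
    # boosting ensemble (gradient-boosted trees) on synthetic data. The library,
    # data, and hyperparameters are assumptions for demonstration, not from the papers.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    models = {
        # Breiman (2001): bootstrap-sampled trees with random feature subsets per split.
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
        # Gradient boosting: trees fit sequentially to the current model's errors;
        # XGBoost and LightGBM are scalable, regularized implementations of this idea.
        "gradient_boosting": GradientBoostingClassifier(n_estimators=200, random_state=0),
    }

    for name, model in models.items():
        model.fit(X_train, y_train)
        print(name, round(accuracy_score(y_test, model.predict(X_test)), 3))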