The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, 2009 (Springer) - A textbook (2nd edition) covering statistical learning methods, including gradient boosting, logistic regression, cross-entropy, and the derivation of gradients for various loss functions.
Pattern Recognition and Machine Learning, Christopher Bishop, 2006 (Springer) - Provides a detailed treatment of probabilistic machine learning, including logistic regression, softmax, and cross-entropy error functions.