Machine Learning, Andrew Ng, 2022 (DeepLearning.AI and Stanford Online) - A widely acclaimed introductory course that clearly explains cost functions, particularly Mean Squared Error, within the context of linear regression and gradient descent.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, 2009 (Springer) - A foundational textbook (2nd edition) that provides a rigorous statistical and mathematical treatment of loss functions, including squared error, and their application in model fitting and optimization.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - This authoritative textbook offers a comprehensive theoretical foundation for various loss functions and optimization techniques relevant to machine learning, suitable for readers seeking deeper mathematical understanding.