Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006 (Springer) DOI: 10.1007/b139369 - A classic textbook providing a comprehensive probabilistic approach to machine learning, discussing loss functions like MSE in the context of model training and evaluation.
CS229 Lecture Notes: Supervised Learning, Andrew Ng, Tengyu Ma, 2018 (Stanford University, Computer Science Department) - Official lecture notes from a renowned machine learning course, offering a clear explanation of regression models and their associated error metrics, including MSE.