Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A foundational textbook that provides comprehensive coverage of optimization challenges, including local minima and saddle points, specifically within the context of deep learning.
The Loss Surfaces of Multilayer Networks, Anna Choromanska, Mikael Henaff, Michael Mathieu, Gerard Ben Arous, Yann LeCun, 2015, Proceedings of Machine Learning Research, Vol. 38 (Proceedings of Machine Learning Research) - This paper analyzes the loss surfaces of deep neural networks and suggests that many local minima may be equivalent in terms of performance, which supports the view that local minima are less problematic for very deep models.