Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - This comprehensive textbook covers the mathematical foundations of neural networks, activation functions, loss functions (including MSE and BCE), and the general architecture and training of autoencoders. Chapter 14 is dedicated to autoencoders.
Reducing the Dimensionality of Data with Neural Networks, Geoffrey E. Hinton and Ruslan R. Salakhutdinov, 2006, Science, Vol. 313 (American Association for the Advancement of Science), DOI: 10.1126/science.1127647 - A seminal paper that revived interest in deep autoencoders by demonstrating their effectiveness for dimensionality reduction and unsupervised pre-training, laying the groundwork for many subsequent developments in representation learning.
Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006 (Springer) - This textbook provides a strong probabilistic foundation for machine learning algorithms, including neural networks, loss functions, and optimization methods. It offers an alternative yet complementary perspective to deep-learning-specific texts.