Reducing the Dimensionality of Data with Neural Networks, Geoffrey E. Hinton, Ruslan R. Salakhutdinov, 2006, Science, Vol. 313 (American Association for the Advancement of Science), DOI: 10.1126/science.1127647 - This paper introduced deep autoencoders, demonstrating their ability to learn compact, low-dimensional representations of high-dimensional data, laying the groundwork for many subsequent autoencoder developments.
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - Provides a comprehensive treatment of autoencoders, their training mechanisms, various loss functions (MSE, cross-entropy), backpropagation, and optimization algorithms like SGD and Adam.
Adam: A Method for Stochastic Optimization, Diederik P. Kingma, Jimmy Ba, 2015, International Conference on Learning Representations (ICLR), DOI: 10.48550/arXiv.1412.6980 - Introduces the Adam optimizer, widely used in deep learning for its efficiency and effectiveness in training neural networks.
CS231n: Convolutional Neural Networks for Visual Recognition, Fei-Fei Li, Yunzhu Li, Ruohan Gao, 2023 - Offers detailed explanations of neural network training, backpropagation, loss functions, and optimization algorithms, which are directly applicable to autoencoder training.