Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A canonical textbook covering the theoretical foundations and practical applications of deep learning, including detailed discussion of activation functions.
Deep Sparse Rectifier Neural Networks, Xavier Glorot, Antoine Bordes, and Yoshua Bengio, 2011, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS), Vol. 15 (Proceedings of Machine Learning Research (PMLR)), DOI: 10.55986/aistats2011.glorot11a - Analyzes the Rectified Linear Unit (ReLU) activation function in deep neural networks and demonstrates its benefits, including the sparse representations it produces.