Understanding the difficulty of training deep feedforward neural networks, Xavier Glorot, Yoshua Bengio, 2010, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Vol. 9 (PMLR) - Introduces the Xavier (Glorot) initialization method to stabilize signal propagation in deep networks with sigmoid/tanh activations.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A foundational textbook offering a comprehensive discussion on neural network initialization, covering theory and practical aspects.