Understanding the difficulty of training deep feedforward neural networks, Xavier Glorot, Yoshua Bengio, 2010, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS), Vol. 9 (JMLR.org) - Introduces Xavier (Glorot) initialization, a foundational method for stabilizing neural network training, particularly for sigmoid and tanh activations.
torch.nn.init - PyTorch documentation, PyTorch Core Team, 2022 (PyTorch) - Official documentation for PyTorch's weight initialization functions, detailing their usage and parameters.
Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive textbook covering various aspects of deep learning, including detailed explanations of parameter initialization methods and their theoretical basis.