Understanding the difficulty of training deep feedforward neural networks, Xavier Glorot, Yoshua Bengio, 2010, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Vol. 9 (JMLR.org) - Introduces Xavier/Glorot initialization, a foundational technique for stabilizing deep neural network training, especially with sigmoid or tanh activations.
torch.nn.init: Initialization, PyTorch Developers, 2022 (PyTorch) - Official documentation for weight initialization functions in PyTorch, including Kaiming (He), Xavier/Glorot, Normal, and Orthogonal initialization methods.