Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016 (MIT Press) - A comprehensive textbook covering deep learning fundamentals, including various weight initialization strategies like Xavier and He, and their theoretical basis.
torch.nn.init, PyTorch Developers, 2022 (PyTorch Foundation) - Official documentation for PyTorch's weight initialization functions, including kaiming_normal_ and kaiming_uniform_, detailing their usage and parameters.
Understanding the difficulty of training deep feedforward neural networks, Xavier Glorot and Yoshua Bengio, 2010, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Vol. 9 (PMLR) - Introduced Xavier initialization, a predecessor to Kaiming initialization, designed for symmetric activation functions and providing context for the challenges Kaiming addresses.