Layer Normalization, Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton, 2016 (arXiv preprint arXiv:1607.06450), DOI: 10.48550/arXiv.1607.06450 - Presents Layer Normalization, an alternative to Batch Normalization that is independent of batch size.
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2016 (MIT Press) - Offers a comprehensive treatment of deep learning foundations, including optimization challenges and normalization techniques.
Group Normalization, Yuxin Wu, Kaiming He, 2019 (International Journal of Computer Vision (IJCV), Vol. 128, Springer US), DOI: 10.1007/s11263-019-01198-w - Proposes Group Normalization as an effective alternative for CNNs when Batch Normalization is not feasible due to small batch sizes.