Mixed-Precision Training of Deep Neural Networks, Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu, 2018. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1710.03740 - This paper introduces the techniques behind mixed-precision training, including gradient scaling, which form the basis of torch.cuda.amp.