Mixed Precision Training, Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu, 2018. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1710.03740 - Introduces mixed-precision training with FP16 and FP32, outlining loss scaling and the use of higher precision for sensitive operations and accumulators.
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy, 2018. 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) (USENIX Association) - Introduces TVM, an end-to-end optimizing compiler whose intermediate representation and scheduling support mixed precision and quantization; relevant to compiler strategies.