ZeRO: Memory Optimizations Towards Training Trillion Parameter Models, Samyam Anand, Olatunji Ruwase, Jeff Rasley, Shaden Smith, Deepthi Karkada, Reza Yazdani Aminabadi, Ronald Pope, Sam Ade Jacobs, Yuxiong He, 2021. SC '21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (ACM). DOI: 10.1145/3458817.3476202 - Introduces the ZeRO (Zero Redundancy Optimizer) family of memory optimization strategies, which form the basis of data parallelism in DeepSpeed and are essential for training large models.