DeepSpeed: Extreme-Scale Model Training for Everyone, Jeff Rasley, Samyam Rajbhandari, Kazem Cheshmi, Chris Ping, Yuxiong He, 2020. KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (Association for Computing Machinery (ACM)). DOI: 10.1145/3394486.3403154 - Introduces the DeepSpeed framework, of which ZeRO is a core component, and discusses its ability to train large-scale models.