GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism, Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Mia Xu Chen, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen, 2019. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). DOI: 10.5555/3454287.3455110 - Introduces a pipeline parallelism approach that improves hardware utilization with micro-batching.