GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism, Yanping Huang, Youlong Cheng, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Thang Luong, Yonghui Wu, Zhifeng Chen, 2019, Advances in Neural Information Processing Systems (NeurIPS) (NeurIPS Foundation), DOI: 10.48550/arXiv.1811.06965 - Introduces the GPipe algorithm for pipeline parallelism, detailing the use of micro-batching to reduce device idle time.
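To make the micro-batching idea concrete, here is a minimal single-process PyTorch sketch. The two stage modules, layer sizes, and micro-batch count are illustrative assumptions; a real GPipe setup places each stage on a separate accelerator and pipelines the micro-batches to shrink idle "bubbles", which this sketch does not do.

```python
import torch
import torch.nn as nn

# Hypothetical two-stage partition of a model; sizes are illustrative only.
stage1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
stage2 = nn.Sequential(nn.Linear(64, 10))
opt = torch.optim.SGD(list(stage1.parameters()) + list(stage2.parameters()), lr=0.1)

x = torch.randn(128, 32)           # one mini-batch
y = torch.randint(0, 10, (128,))
num_micro = 4                      # GPipe's M micro-batches per mini-batch

opt.zero_grad()
# Split the mini-batch into micro-batches and run them through the stages.
# In GPipe the micro-batches of different stages overlap in time across devices.
for xb, yb in zip(x.chunk(num_micro), y.chunk(num_micro)):
    out = stage2(stage1(xb))
    loss = nn.functional.cross_entropy(out, yb) / num_micro
    loss.backward()                # gradients accumulate across micro-batches
opt.step()                         # one synchronous update per mini-batch, as in GPipe
```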
Distributed communication package - torch.distributed, PyTorch Contributors, 2025 (PyTorch) - Official documentation of PyTorch's distributed communication primitives (point-to-point send/recv and collectives), which underpin manual pipeline parallelism implementations.
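As a sketch of how those primitives support a hand-rolled pipeline, the example below spawns two processes and uses torch.distributed send/recv (gloo backend, CPU) to pass activations from a hypothetical stage 0 to stage 1. The module shapes, batch size, address, and port are assumptions for illustration only.

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn

def run(rank, world_size):
    # Gloo backend so the sketch runs on CPU; address/port are illustrative.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    if rank == 0:
        # Stage 0: run the first part of the model, ship activations to rank 1.
        stage = nn.Linear(16, 32)
        act = stage(torch.randn(8, 16))
        dist.send(act.detach(), dst=1)
    else:
        # Stage 1: receive activations from rank 0 and run the second part.
        buf = torch.empty(8, 32)
        dist.recv(buf, src=0)
        stage = nn.Linear(32, 4)
        out = stage(buf)
        print(f"rank {rank} produced output of shape {tuple(out.shape)}")

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(run, args=(2,), nprocs=2)
```

A full pipeline would also send gradients back from stage 1 to stage 0 during the backward pass and interleave micro-batches, which this forward-only sketch omits.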