Python toolkit for building production-ready LLM applications. Modular utilities for prompts, RAG, agents, structured outputs, and multi-provider support.
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism, Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Mia Xu Chen, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen, 2019. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). DOI: 10.5555/3454287.3455110 - Introduces a pipeline parallelism approach that improves hardware utilization with micro-batching.