Expert Parallelism: Distributing Experts Across Devices
New · Open Source
Kerb - LLM Development Toolkit
Python toolkit for building production-ready LLM applications. Modular utilities for prompts, RAG, agents, structured outputs, and multi-provider support.
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts, Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V Le, Yonghui Wu, Zhifeng Chen, Claire Cui, 2021ICML 2022DOI: 10.48550/arXiv.2112.06905 - Presents a large-scale Mixture-of-Experts model, detailing practical considerations and distributed training aspects relevant to expert partitioning.