Python toolkit for building production-ready LLM applications. Modular utilities for prompts, RAG, agents, structured outputs, and multi-provider support.
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity, William Fedus, Barret Zoph, Noam Shazeer, 2022, Journal of Machine Learning Research, Vol. 23 - Presents the Switch Transformer architecture, which addresses communication overhead and load imbalance in distributed MoE training through strategies such as an auxiliary load balancing loss.
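The load balancing loss referenced above can be sketched in a few lines. In the Switch Transformers paper it is defined as alpha * N * sum_i(f_i * P_i), where f_i is the fraction of tokens dispatched (top-1) to expert i and P_i is the mean router probability assigned to expert i. The function name and NumPy implementation below are illustrative, not from the paper's codebase:

```python
import numpy as np

def load_balancing_loss(router_logits, num_experts, alpha=0.01):
    """Auxiliary load balancing loss for top-1 (Switch-style) MoE routing.

    router_logits: array of shape (num_tokens, num_experts).
    alpha: loss coefficient (the paper reports alpha = 0.01 works well).
    """
    # Softmax over experts to get router probabilities per token.
    shifted = router_logits - router_logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Top-1 routing: each token is dispatched to its highest-probability expert.
    assignments = probs.argmax(axis=-1)

    # f_i: fraction of tokens dispatched to expert i.
    f = np.bincount(assignments, minlength=num_experts) / len(assignments)
    # P_i: mean router probability allocated to expert i.
    p = probs.mean(axis=0)

    # Loss is minimized (value alpha) under a uniform token distribution.
    return alpha * num_experts * float(np.dot(f, p))
```

Because the loss scales with N * sum(f_i * P_i), perfectly balanced routing yields the minimum value alpha, while collapsing all tokens onto one expert roughly doubles it for N = 2, giving the optimizer a direct signal to spread load.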