torch.nn.Module, PyTorch Contributors, 2024 (PyTorch Foundation) - Official documentation for PyTorch's base class for all neural network modules, essential for implementing custom layers like the routing strategies shown.
Designing and Training Sparse Mixture-of-Experts Models for Language, Elena Albarran, William Fedus, Andrew M. Dai, Sharan Narang, Noam Shazeer, 2023 (Google AI Blog) - Provides an accessible overview of Mixture-of-Experts models, including discussion of routing mechanisms and the importance of load balancing.