Efficient Transformers: A Survey, Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler, 2022. ACM Computing Surveys, Vol. 55 (Association for Computing Machinery). DOI: 10.1145/3530811 - Provides a comprehensive overview of techniques for making Transformer models more efficient, including architectural modifications and attention alternatives.