Python toolkit for building production-ready LLM applications. Modular utilities for prompts, RAG, agents, structured outputs, and multi-provider support.
Quantization with bitsandbytes and Transformers, Hugging Face, 2024 - Practical guide to applying quantization techniques, including GPTQ, to large language models with the Hugging Face Transformers library, showing how they are used in practice.