Python toolkit for building production-ready LLM applications. Modular utilities for prompts, RAG, agents, structured outputs, and multi-provider support.