Quantization and Pruning Techniques for LLM Deployment
Quantization for Deep Learning Models, PyTorch Documentation, 2019 (PyTorch Foundation) - Official documentation for PyTorch's quantization module, covering post-training quantization (PTQ) and quantization-aware training (QAT).