Distilling the Knowledge in a Neural Network. Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015. arXiv preprint arXiv:1503.02531. DOI: 10.48550/arXiv.1503.02531 - The foundational paper introducing knowledge distillation, focusing on training the student with softened output probabilities (soft targets) produced by temperature scaling.
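For readers who want the objective in concrete form, here is a minimal sketch of the soft-target loss described in this paper, written with PyTorch as an assumed framework (the paper is framework-agnostic). The temperature `T`, the mixing weight `alpha`, and the function name are illustrative choices, not values prescribed by the paper.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hinton-style distillation: soft targets at temperature T plus a hard-label term."""
    # Soft-target term: KL divergence between the temperature-softened
    # teacher and student distributions. The T**2 factor keeps gradient
    # magnitudes comparable as T changes, as noted in the paper.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Hard-target term: ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha is an illustrative mixing weight; both alpha and T are tuned empirically.
    return alpha * soft + (1 - alpha) * hard
```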
FitNets: Hints for Thin Deep Nets. Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1412.6550 - Introduces feature-based knowledge distillation, where the student model learns from intermediate representations of the teacher.
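As a companion to the entry above, a minimal sketch of a FitNets-style hint loss, again assuming PyTorch and convolutional feature maps; the class name `HintLoss` and the channel arguments are illustrative. The paper trains a learned regressor so the thinner student's features can be compared against a wider teacher "hint" layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HintLoss(nn.Module):
    """FitNets-style hint loss: match student features to a teacher 'hint' layer."""

    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # Learned regressor (a 1x1 convolution for conv features, as in the
        # paper) that maps the thinner student's feature maps into the
        # teacher's feature space.
        self.regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        # Teacher activations act as fixed targets; no gradient flows into
        # the teacher. Assumes the spatial dimensions of the two feature
        # maps already match.
        return F.mse_loss(self.regressor(student_feat), teacher_feat.detach())
```

In the paper this hint loss is used as a first stage, training the student up to the guided layer, before a second stage of standard output-level distillation.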