FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2014. arXiv preprint arXiv:1412.6550. DOI: 10.48550/arXiv.1412.6550 - Introduces feature-map-based knowledge distillation, a key approach for intermediate representation matching.
Similarity-Preserving Knowledge Distillation, Frederick Tung, Greg Mori, 2019. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (IEEE). DOI: 10.1109/ICCV.2019.00145 - Presents a distillation loss that trains the student to preserve the teacher's pairwise activation similarities across the inputs in a batch, rather than mimicking the teacher's representations directly.