Model Pruning and Adaptation for Device Constraints
Once-for-All: Train One Network and Specialize It for Efficient Deployment, Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han, 2020, International Conference on Learning Representations. DOI: 10.48550/arXiv.1908.09791 - This paper introduces the Once-for-All (OFA) network architecture, a flexible framework in which a single large network is trained once and numerous specialized sub-networks can be extracted and deployed efficiently without further retraining, directly relevant to model adaptation.
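To make the "train once, specialize without retraining" idea concrete, here is a toy PyTorch sketch, not the OFA codebase: it builds a narrower convolution by keeping only the first k filters of a wider trained layer (the elastic-width idea). The helper name extract_subnet_conv and all layer sizes are made up for illustration.

```python
import torch
import torch.nn as nn

def extract_subnet_conv(full_conv: nn.Conv2d, out_channels: int) -> nn.Conv2d:
    # Specialize a wider trained layer by keeping its first `out_channels` filters.
    sub = nn.Conv2d(full_conv.in_channels, out_channels,
                    kernel_size=full_conv.kernel_size,
                    padding=full_conv.padding,
                    bias=full_conv.bias is not None)
    with torch.no_grad():
        sub.weight.copy_(full_conv.weight[:out_channels])
        if full_conv.bias is not None:
            sub.bias.copy_(full_conv.bias[:out_channels])
    return sub

full = nn.Conv2d(3, 64, kernel_size=3, padding=1)   # stands in for a trained layer
small = extract_subnet_conv(full, out_channels=32)  # deploy-time specialization
print(small(torch.randn(1, 3, 32, 32)).shape)       # torch.Size([1, 32, 32, 32])
```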
Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015, arXiv preprint arXiv:1503.02531. DOI: 10.48550/arXiv.1503.02531 - This foundational paper introduces the concept of knowledge distillation, a method for transferring knowledge from a large, complex 'teacher' model to a smaller, more efficient 'student' model, which is a core technique for model adaptation.
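A minimal sketch of the soft-target loss described in this paper, assuming the standard blend of a temperature-scaled KL term and ordinary cross-entropy; the function name distillation_loss and the temperature and alpha values are illustrative choices, not values taken from the paper's experiments.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-softened teacher
    # and student distributions, scaled by T^2 so its gradient magnitude
    # stays comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard-label term: ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Random tensors stand in for real teacher/student outputs.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student_logits, teacher_logits, labels))
```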
Pruning Filters for Efficient ConvNets, Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, Hans Peter Graf, 2017, International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1608.08710 - This paper proposes a structured pruning method that removes entire filters from convolutional neural networks, demonstrating how to achieve significant model compression while maintaining accuracy, directly related to the provided example of filter pruning.
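The paper's L1-norm criterion can be sketched in a few lines of PyTorch. The snippet below prunes a single convolution in isolation; the helper name prune_filters_l1 and the keep_ratio parameter are illustrative, and in a full network the next layer's input channels would also have to be removed to match.

```python
import torch
import torch.nn as nn

def prune_filters_l1(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    # Rank filters by the L1 norm of their weights and keep the top fraction.
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep = torch.topk(scores, n_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, n_keep,
                       kernel_size=conv.kernel_size,
                       stride=conv.stride,
                       padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # stands in for a trained layer
pruned = prune_filters_l1(conv, keep_ratio=0.5)
print(pruned(torch.randn(1, 16, 56, 56)).shape)     # torch.Size([1, 16, 56, 56])
```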