The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, Jonathan Frankle, Michael Carbin, ICLR 2019. DOI: 10.48550/arXiv.1803.03635 - This paper proposes the 'Lottery Ticket Hypothesis,' which offers an explanation for the effectiveness of iterative magnitude pruning by demonstrating empirically that dense, randomly initialized networks contain sparse subnetworks ('winning tickets') that can be trained in isolation to comparable accuracy (a minimal sketch of this procedure follows the reference list).
Rethinking the Value of Network Pruning, Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell, ICLR 2019. DOI: 10.48550/arXiv.1810.05270 - This research critically evaluates existing pruning techniques, including magnitude-based methods, and discusses the discrepancy between nominal sparsity and actual inference speedup, emphasizing that unstructured sparsity is difficult to exploit on general-purpose hardware.
The State of Sparsity in Deep Neural Networks, Trevor Gale, Erich Elsen, Sara Hooker, arXiv preprint 2019. DOI: 10.48550/arXiv.1902.09574 - This large-scale empirical study evaluates sparsification techniques, with magnitude-based pruning as a simple and strong baseline, offering a broad view of the field and the challenges of training sparse networks.
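
The first and third entries both rest on iterative magnitude pruning: train, remove the smallest-magnitude weights, and (in the lottery-ticket experiment) rewind the surviving weights to their original initialization before retraining. The sketch below illustrates that loop on a toy linear-regression problem; the toy task, the training loop, and all variable names are illustrative assumptions, not code from the cited papers.

```python
# Minimal sketch of iterative magnitude pruning with weight rewinding
# (the lottery-ticket experiment). Toy setup for illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Toy task: linear regression with a sparse ground-truth weight vector,
# so "training" is a cheap gradient-descent loop.
X = rng.normal(size=(256, 32))
true_w = rng.normal(size=32) * (rng.random(32) < 0.25)
y = X @ true_w + 0.01 * rng.normal(size=256)

w_init = 0.1 * rng.normal(size=32)   # dense random initialization (kept for rewinding)
mask = np.ones_like(w_init)          # 1 = weight kept, 0 = pruned

def train(w, mask, steps=500, lr=0.01):
    """Gradient descent on MSE; pruned weights stay frozen at zero."""
    w = w * mask
    for _ in range(steps):
        grad = 2.0 / len(X) * X.T @ (X @ w - y)
        w = (w - lr * grad) * mask
    return w

prune_fraction = 0.2  # remove 20% of the remaining weights each round
for round_ in range(5):
    w_trained = train(w_init, mask)
    # Rank surviving weights by magnitude and prune the smallest fraction.
    surviving = np.flatnonzero(mask)
    k = int(len(surviving) * prune_fraction)
    to_prune = surviving[np.argsort(np.abs(w_trained[surviving]))[:k]]
    mask[to_prune] = 0.0
    # Lottery-ticket step: rewind unpruned weights to w_init and retrain.
    loss = np.mean((X @ train(w_init, mask) - y) ** 2)
    print(f"round {round_}: {int(mask.sum())} weights kept, retrain loss {loss:.4f}")
```

Because the mask is applied elementwise, the result is unstructured sparsity; as the second entry notes, such sparsity reduces parameter count but does not by itself yield inference speedups on general-purpose hardware.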