Pruning Filters for Efficient ConvNets, Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, Hans Peter Graf, 2017, International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1608.08710 - A seminal work that proposes pruning entire filters/channels in convolutional neural networks, demonstrating the practical speedups of structured sparsity.
A Guide to N:M Sparsity with NVIDIA Ampere GPUs, NVIDIA Developer, 2021 (NVIDIA) - Explains the N:M semi-structured sparsity supported by NVIDIA Ampere GPUs (a 2:4 pattern, with two nonzero values in every group of four), illustrating how dedicated hardware support can accelerate specific structured pruning patterns.
What is the State of Neural Network Pruning?, Davis Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag, 2020, Proceedings of Machine Learning and Systems (MLSys). DOI: 10.48550/arXiv.2003.03033 - A comprehensive survey reviewing various neural network pruning techniques, including unstructured and structured methods, and their practical implications.