Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE), DOI: 10.1109/CVPR.2018.00696 - Introduces a widely adopted 8-bit integer quantization scheme that allows inference using integer-only arithmetic, together with a matching quantization-aware training procedure, providing a basis for many practical implementations.
Quantization for PyTorch Models, PyTorch Documentation, 2024 (PyTorch) - Official documentation explaining how quantization is implemented in a popular deep learning framework, including support for various integer types.