MLIR Quantization Dialect, The MLIR Team, 2024 (LLVM Project) - Describes the data types and operations for quantization in MLIR, including how scales and zero points are represented in the IR.
Quantization for Deep Learning: A Comprehensive Survey, Amir Gholami, Song Han, Kai-Wen Chang, Forrest Iandola, Sreyash Kenkre, Brandon Wu, Michael W. Mahoney, 2021. Synthesis Lectures on Computer Architecture, Vol. 16 (Morgan & Claypool Publishers). DOI: 10.2200/S01089ED1V01Y202103CAC001 - Provides a broad overview of deep learning quantization techniques, covering different schemes, algorithms, and practical considerations for compilers and hardware integration.