Distance-aware Quantization (ICCV 2021)

The discretizer takes a full-precision input and assigns it to the nearest quantized value. We interpret this assignment process as follows: the discretizer first computes the distances between the full-precision input and the quantized values, and then applies an argmin operator over these distances to choose the quantized value. Since this operator is non-differentiable, the quantized network cannot be trained end-to-end with gradient-based optimizers.
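
The sketch below makes this distance-based view concrete. It is a minimal PyTorch illustration of ours, not the authors' code: it scores each input against a set of quantized values and picks the nearest one with an argmin, which is exactly the non-differentiable step discussed above.

import torch

# Minimal illustration (ours, not the authors' code): quantization as a
# distance-based assignment between full-precision inputs and a fixed
# set of quantized values.
def hard_assign(x, q_values):
    # Distance scores between every input entry and every quantized value.
    dist = (x.unsqueeze(-1) - q_values).abs()   # shape (..., num_levels)
    idx = dist.argmin(dim=-1)                   # non-differentiable argmin
    return q_values[idx]

x = torch.tensor([0.12, -0.87, 0.55])           # full-precision inputs
q = torch.tensor([-1.0, 0.0, 1.0])              # quantized values
print(hard_assign(x, q))                        # tensor([ 0., -1.,  1.])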

Authors

Dohyung Kim, Junghyup Lee, Bumsub Ham*

* corresponding author

Abstract

We address the problem of network quantization, that is, reducing bit-widths of weights and/or activations to lighten network architectures. Quantization methods use a rounding function to map full-precision values to the nearest quantized ones, but this operation is not differentiable. There are two main approaches to training quantized networks with gradient-based optimizers. First, a straight-through estimator (STE) replaces the zero derivative of the rounding with that of an identity function, which causes a gradient mismatch problem. Second, soft quantizers approximate the rounding with continuous functions at training time, and exploit the rounding for quantization at test time. This alleviates the gradient mismatch, but causes a quantizer gap problem. We alleviate both problems in a unified framework. To this end, we introduce a novel quantizer, dubbed a distance-aware quantizer (DAQ), that mainly consists of a distance-aware soft rounding (DASR) and a temperature controller. To alleviate the gradient mismatch problem, DASR approximates the discrete rounding with the kernel soft argmax, which is based on our insight that the quantization can be formulated as a distance-based assignment problem between full-precision values and quantized ones. The controller adjusts the temperature parameter in DASR adaptively according to the input, addressing the quantizer gap problem. Experimental results on standard benchmarks show that DAQ outperforms the state of the art significantly for various bit-widths without bells and whistles.
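
As a hedged sketch of the core idea, the snippet below replaces the hard argmin over distance scores with a softmax-weighted (soft argmax) assignment controlled by a temperature. The paper's DASR uses a kernel soft argmax; this plain soft-argmax version, and all function and parameter names in it, are our simplification.

import torch

# Hedged sketch of differentiable rounding via a soft argmax over distance
# scores. The paper's DASR uses a *kernel* soft argmax; this plain version
# and the names below are our simplification, not the official code.
def soft_round(x, q_values, temperature):
    scores = -(x.unsqueeze(-1) - q_values).abs() / temperature
    weights = torch.softmax(scores, dim=-1)     # differentiable assignment
    return (weights * q_values).sum(dim=-1)     # softly rounded output

x = torch.tensor([0.3], requires_grad=True)
q = torch.tensor([0.0, 1.0])
y = soft_round(x, q, temperature=0.1)
y.backward()                                    # gradients flow, unlike argmin
print(y.item(), x.grad)

As the temperature shrinks, the softmax weights concentrate on the nearest quantized value and the soft assignment approaches the hard rounding used at test time; this is why controlling the temperature matters for the quantizer gap.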

Overview of our framework

Our quantizer mainly consists of DASR with a temperature controller. DAQ first normalizes a full-precision input. DASR takes the normalized input and computes distance scores w.r.t. the quantized values. It then assigns the input to the nearest quantized value. For the assignment, we exploit a differentiable version of the argmax with an adaptive temperature obtained from our controller.
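
Putting the pieces together, here is a self-contained sketch of the pipeline just described: normalize, score distances, and assign with a temperature obtained from a controller. The controller rule below is a placeholder of our own; the paper derives the temperature adaptively from the input in its own way.

import torch
import torch.nn as nn

# End-to-end sketch of the pipeline above (our simplification, not the
# official code): normalize the input, score distances to the quantized
# values, and assign via a soft argmax whose temperature comes from a
# controller.
class DAQSketch(nn.Module):
    def __init__(self, q_values):
        super().__init__()
        self.register_buffer("q_values", q_values)

    def controller(self, x_norm):
        # Placeholder adaptive temperature derived from input statistics;
        # the paper defines its controller differently, so this rule is
        # an assumption for illustration only.
        return 0.05 + 0.1 * x_norm.std().detach()

    def forward(self, x):
        # Normalize the full-precision input to the quantization range.
        x_norm = (x - x.mean()) / (x.std() + 1e-8)
        x_norm = x_norm.clamp(self.q_values.min().item(),
                              self.q_values.max().item())
        beta = self.controller(x_norm)                    # adaptive temperature
        scores = -(x_norm.unsqueeze(-1) - self.q_values).abs() / beta
        weights = torch.softmax(scores, dim=-1)           # soft argmax
        return (weights * self.q_values).sum(dim=-1)

daq = DAQSketch(torch.tensor([-1.0, 0.0, 1.0]))
print(daq(torch.randn(8)))                                # soft-quantized tensor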

Experiment

Quantitative results of ResNet-18 on the validation split of ImageNet. We report the top-1 accuracy for comparison. We denote by "W" and "A" the bit-precision of weights and activations, respectively. "FP" and "*" represent accuracies for the full-precision and fully quantized models, respectively. Numbers in bold indicate the best performance, and numbers in parentheses are accuracy improvements or degradations relative to the full-precision model.

Paper

D. Kim, J. Lee, B. Ham
Distance-aware Quantization
In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021
[Paper on arXiv]

Code

Training/testing code (PyTorch)

BibTeX

@InProceedings{Kim21,
        author       = "D. Kim and J. Lee and B. Ham",
        title        = "Distance-aware Quantization",
        booktitle    = "ICCV",
        year         = "2021",
}

Acknowledgements

This research was supported by the Samsung Research Funding & Incubation Center for Future Technology (SRFC-IT1802-06).