PyTorch implementation of QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
QDrop is a simple yet effective approach that randomly drops the quantization of activations during reconstruction in order to pursue a flatter model on both calibration and test data. QDrop is easy to implement for various neural networks, including CNNs and Transformers, and is plug-and-play with little additional computational cost.
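As a rough illustration of the idea (a minimal sketch only, not the quantizer implemented in this repo; the module name, the scale/zero-point handling, and the fake_quantize helper are all illustrative assumptions): during reconstruction each activation element is quantized with some probability and kept at full precision otherwise.

import torch
import torch.nn as nn


def fake_quantize(x, scale, zero_point, n_bits):
    # Uniform fake quantization: quantize, clamp, then dequantize (illustrative only).
    q = torch.clamp(torch.round(x / scale) + zero_point, 0, 2 ** n_bits - 1)
    return (q - zero_point) * scale


class QDropActivation(nn.Module):
    # Illustrative activation quantizer with QDrop-style random dropping:
    # in training (reconstruction) mode, each element is quantized with
    # probability `prob` and kept at full precision otherwise; at evaluation
    # time everything is quantized.
    def __init__(self, scale, zero_point, n_bits=2, prob=0.5):
        super().__init__()
        self.scale = scale
        self.zero_point = zero_point
        self.n_bits = n_bits
        self.prob = prob

    def forward(self, x):
        x_q = fake_quantize(x, self.scale, self.zero_point, self.n_bits)
        if self.training and self.prob < 1.0:
            mask = torch.rand_like(x) < self.prob
            return torch.where(mask, x_q, x)  # randomly drop quantization elementwise
        return x_q


# Tiny usage example with made-up quantization parameters.
quantizer = QDropActivation(scale=0.1, zero_point=0, n_bits=2, prob=0.5)
quantizer.train()
y = quantizer(torch.randn(4, 8))  # mix of quantized and full-precision activations

Setting prob to 1.0 quantizes everything and thus disables dropping, which matches the No Drop baseline described below.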
Our method has been integrated into the quantization benchmark MQBench; the docs can be found at https://mqbench.readthedocs.io/en/latest/.
Moreover, following the design of the quantization structures in MQBench, we also implement another form of QDrop in the "qdrop" branch. You can use whichever branch you like; details can be found in the README of the "qdrop" branch.
Go into the exp directory, where you will find run.sh and config.sh. run.sh is an example for ResNet-18 at W2A2 (2-bit weights, 2-bit activations). You can run config.sh to produce similar scripts across bit-widths and architectures.
run.sh
#!/bin/bash
PYTHONPATH=../../../../:$PYTHONPATH \
python ../../../main_imagenet.py --data_path data_path \
--arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 2 --act_quant --order together --wwq --waq --awq --aaq \
--weight 0.01 --input_prob 0.5 --prob 0.5
config.sh
#!/bin/bash
# Pretrained models and hyperparameters follow BRECQ.
arch=('resnet18' 'resnet50' 'mobilenetv2' 'regnetx_600m' 'regnetx_3200m' 'mnasnet')
weight=(0.01 0.01 0.1 0.01 0.01 0.2)
w_bit=(3 2 2 4)
a_bit=(3 4 2 4)
for ((i = 0; i < 6; i++)); do
    for ((j = 0; j < 4; j++)); do
        path=w${w_bit[j]}a${a_bit[j]}/${arch[i]}
        mkdir -p "$path"
        echo "$path"
        cp run.sh "$path/run.sh"
        # Patch the per-architecture --weight hyperparameter, the architecture name,
        # and the target bit-widths into the copied script.
        sed -i -re "s/weight([[:space:]]+)0.01/weight ${weight[i]}/" "$path/run.sh"
        sed -i -re "s/resnet18/${arch[i]}/" "$path/run.sh"
        sed -i -re "s/n_bits_w([[:space:]]+)2/n_bits_w ${w_bit[j]}/" "$path/run.sh"
        sed -i -re "s/n_bits_a([[:space:]]+)2/n_bits_a ${a_bit[j]}/" "$path/run.sh"
    done
done
Running config.sh produces a series of scripts that you can run directly to reproduce the results below.
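For example (a sketch assuming config.sh is executed from inside the exp directory; the directory names follow the w{W}a{A}/{arch} pattern that config.sh creates):

cd exp
bash config.sh          # generate w{W}a{A}/{arch}/run.sh for every bit-width/architecture pair
cd w2a2/resnet18        # e.g., 2-bit weights and 2-bit activations with ResNet-18
bash run.sh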
Accuracy (%) on ImageNet with low-bit activations:
Methods | Bits (W/A) | Res18 | Res50 | MNV2 | Reg600M | Reg3.2G | MNasx2 |
---|---|---|---|---|---|---|---|
Full Prec. | 32/32 | 71.06 | 77.00 | 72.49 | 73.71 | 78.36 | 76.68 |
QDrop | 4/4 | 69.07 | 75.03 | 67.91 | 70.81 | 76.36 | 72.81 |
QDrop | 2/4 | 64.49 | 70.09 | 53.62 | 63.36 | 71.69 | 63.22 |
QDrop | 3/3 | 65.57 | 71.28 | 55.00 | 64.84 | 71.70 | 64.44 |
QDrop | 2/2 | 51.76 | 55.36 | 10.21 | 38.35 | 54.00 | 23.62 |
Case 1, Case 2, Case 3
To compare the results of the three cases mentioned in the observation part of the method, you can use the following commands.
Case 1: weight tuning sees no activation quantization.
Case 2: weight tuning sees full activation quantization.
Case 3: weight tuning sees partial activation quantization.
# Case 1
python main_imagenet.py --data_path data_path --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 2 --act_quant --order after --wwq --awq --aaq --input_prob 1.0 --prob 1.0
# Case 2
python main_imagenet.py --data_path data_path --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 2 --act_quant --order before --wwq --waq --aaq --input_prob 1.0 --prob 1.0
# Case 3
python main_imagenet.py --data_path data_path --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 2 --act_quant --order after --wwq --waq --awq --aaq --input_prob 1.0 --prob 1.0
No Drop
For comparison with QDrop, the No Drop baseline is obtained by setting the probability to 1.0, which disables dropping quantization during weight tuning.
python main_imagenet.py --data_path data_path --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 2 --act_quant --order together --wwq --waq --awq --aaq --input_prob 1.0 --prob 1.0
If you find this repo useful for your research, please consider citing the paper:
@article{wei2022qdrop,
  title={QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization},
  author={Wei, Xiuying and Gong, Ruihao and Li, Yuhang and Liu, Xianglong and Yu, Fengwei},
  journal={arXiv preprint arXiv:2203.05740},
  year={2022}
}