diff --git a/docs/model_zoo/classification.rst b/docs/model_zoo/classification.rst index 7b8b62ff43..07d03a5e63 100644 --- a/docs/model_zoo/classification.rst +++ b/docs/model_zoo/classification.rst @@ -82,6 +82,8 @@ ImageNet - Download weights given a hashtag: ``net = get_model('ResNet50_v1d', pretrained='117a384e')`` + ``ResNet50_v1_int8`` and ``MobileNet1.0_int8`` are quantized models calibrated on the ImageNet dataset. + .. role:: tag ResNet @@ -89,6 +91,8 @@ ResNet .. hint:: + - ``ResNet50_v1_int8`` is a quantized model for ``ResNet50_v1``. + - ``ResNet_v1b`` modifies ``ResNet_v1`` by setting stride at the 3x3 layer for a bottleneck block. - ``ResNet_v1c`` modifies ``ResNet_v1b`` by replacing the 7x7 conv layer with three 3x3 conv layers. @@ -107,6 +111,8 @@ ResNet +---------------------------+--------+--------+----------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------+ | ResNet50_v1 [1]_ | 77.36 | 93.57 | cc729d95 | `shell script `_ | `log `_ | +---------------------------+--------+--------+----------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------+ + | ResNet50_v1_int8 [1]_ | 76.86 | 93.46 | cc729d95 | | | + +---------------------------+--------+--------+----------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------+ | ResNet101_v1 [1]_ | 78.34 | 94.01 | d988c13d | `shell script `_ | `log `_ |
+---------------------------+--------+--------+----------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------+ | ResNet152_v1 [1]_ | 79.22 | 94.64 | acfd0970 | `shell script `_ | `log `_ | @@ -177,6 +183,10 @@ ResNext MobileNet --------- +.. hint:: + + - ``MobileNet1.0_int8`` is a quantized model for ``MobileNet1.0``. + .. table:: :widths: 45 5 5 10 20 15 @@ -185,6 +195,8 @@ MobileNet +==========================+========+========+==========+=========================================================================================================================================+===============================================================================================================================+ | MobileNet1.0 [4]_ | 73.28 | 91.30 | efbb2ca3 | `shell script `_ | `log `_ | +--------------------------+--------+--------+----------+-----------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------+ + | MobileNet1.0_int8 [4]_ | 72.85 | 90.99 | efbb2ca3 | | | + +--------------------------+--------+--------+----------+-----------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------+ | :tag:`MobileNet1.0` [4]_ | 72.93 | 91.14 | cce75496 | `shell script `_ | `log `_ | 
+--------------------------+--------+--------+----------+-----------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------+ | MobileNet0.75 [4]_ | 70.25 | 89.49 | 84c801e2 | `shell script `_ | `log `_ | diff --git a/docs/model_zoo/detection.rst b/docs/model_zoo/detection.rst index b3273ee647..6cd4691a22 100644 --- a/docs/model_zoo/detection.rst +++ b/docs/model_zoo/detection.rst @@ -34,6 +34,8 @@ and their performances with more details. - ``(320x320)`` indicate that the model was evaluated with resolution 320x320. If not otherwise specified, all detection models in GluonCV can take various input shapes for prediction. Some models are trained with various input data shapes, e.g., Faster-RCNN and YOLO models. + - ``ssd_300_vgg16_atrous_voc_int8`` is a quantized model calibrated on Pascal VOC dataset for ``ssd_300_vgg16_atrous_voc``. + .. hint:: The training commands work with the following scripts: @@ -53,6 +55,8 @@ Pascal VOC The VOC metric, mean Average Precision (mAP) across all classes with IoU threshold 0.5 is reported. + Quantized SSD models are evaluated with ``nms_thresh=0.45``, ``nms_topk=200``. + SSD --- @@ -62,17 +66,25 @@ Checkout SSD demo tutorial here: :ref:`sphx_glr_build_examples_detection_demo_ss .. 
table:: :widths: 50 5 25 20 - +----------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ - | Model | mAP | Training Command | Training log | - +==================================+=======+======================================================================================================================================+=====================================================================================================================================+ - | ssd_300_vgg16_atrous_voc [1]_ | 77.6 | `shell script `_ | `log `_ | - +----------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ - | ssd_512_vgg16_atrous_voc [1]_ | 79.2 | `shell script `_ | `log `_ | - +----------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ - | ssd_512_resnet50_v1_voc [1]_ | 80.1 | `shell script `_ | `log `_ | - +----------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ - | ssd_512_mobilenet1.0_voc [1]_ | 75.4 | `shell script `_ | `log `_ | - 
+----------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + +----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | Model | mAP | Training Command | Training log | + +========================================+=======+======================================================================================================================================+=====================================================================================================================================+ + | ssd_300_vgg16_atrous_voc [1]_ | 77.6 | `shell script `_ | `log `_ | + +----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | ssd_300_vgg16_atrous_voc_int8* [1]_ | 77.46 | | | + +----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | ssd_512_vgg16_atrous_voc [1]_ | 79.2 | `shell script `_ | `log `_ | + 
+----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | ssd_512_vgg16_atrous_voc_int8* [1]_ | 78.39 | | | + +----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | ssd_512_resnet50_v1_voc [1]_ | 80.1 | `shell script `_ | `log `_ | + +----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | ssd_512_resnet50_v1_voc_int8* [1]_ | 80.16 | | | + +----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | ssd_512_mobilenet1.0_voc [1]_ | 75.4 | `shell script `_ | `log `_ | + +----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ + | ssd_512_mobilenet1.0_voc_int8* [1]_ | 75.04 | | | + 
+----------------------------------------+-------+--------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+ Faster-RCNN ----------- diff --git a/docs/tutorials/deployment/int8_inference.py b/docs/tutorials/deployment/int8_inference.py new file mode 100644 index 0000000000..0d91f5f651 --- /dev/null +++ b/docs/tutorials/deployment/int8_inference.py @@ -0,0 +1,82 @@ +"""3. Inference with Quantized Models +===================================== + +This tutorial illustrates how to use quantized GluonCV +models for inference on Intel Xeon processors to gain higher performance. + +The following example requires ``GluonCV>=0.4`` and ``MXNet-mkl>=1.5.0b20190314``. Please follow `our installation guide <../../index.html#installation>`__ to install or upgrade GluonCV and the nightly build of MXNet if necessary. + +Introduction +------------ + +GluonCV provides some quantized models to improve performance and reduce deployment costs for computer vision inference tasks. In production, lower precision (INT8) has two main benefits. First, computation can be accelerated by low-precision instructions such as Intel Vector Neural Network Instructions (VNNI). Second, a lower-precision data type saves memory bandwidth, allows better cache locality, and saves power. The new feature can deliver up to a 2X speedup on current AWS EC2 CPU instances and is expected to reach 4X on `Intel Deep Learning Boost (VNNI) `_ enabled hardware, with less than a 0.5% accuracy drop. + +Please check out `verify_pretrained.py `_ for ImageNet inference +and `eval_ssd.py `_ for SSD inference. + +Performance +----------- + +GluonCV supports some quantized classification models and detection models.
+For throughput, the goal is to maximize machine efficiency by batching inference requests together and returning the results in one iteration. As the bar chart shows, quantization improved throughput by 1.46X to 2.71X for the selected models. +The CPU performance below was measured on AWS EC2 C5.18xlarge with 18 cores. + +.. figure:: https://user-images.githubusercontent.com/17897736/54540947-dc08c480-49d3-11e9-9a0d-a97d44f9792c.png + :alt: Gluon Quantization Performance + + Gluon Quantization Performance + ++-----------------------+----------+------------+------------------+------------------+---------+-----------------+-----------------+ +| Model | Dataset | Batch Size | C5.18xlarge FP32 | C5.18xlarge INT8 | Speedup | FP32 Accuracy | INT8 Accuracy | ++=======================+==========+============+==================+==================+=========+=================+=================+ +| ResNet50 V1 | ImageNet | 128 | 122.02 | 276.72 | 2.27 | 77.21%/93.55% | 76.86%/93.46% | ++-----------------------+----------+------------+------------------+------------------+---------+-----------------+-----------------+ +| MobileNet 1.0 | ImageNet | 128 | 375.33 | 1016.39 | 2.71 | 73.28%/91.22% | 72.85%/90.99% | ++-----------------------+----------+------------+------------------+------------------+---------+-----------------+-----------------+ +| SSD-VGG 300* | VOC | 224 | 21.55 | 31.47 | 1.46 | 77.4 | 77.46 | ++-----------------------+----------+------------+------------------+------------------+---------+-----------------+-----------------+ +| SSD-VGG 512* | VOC | 224 | 7.63 | 11.69 | 1.53 | 78.41 | 78.39 | ++-----------------------+----------+------------+------------------+------------------+---------+-----------------+-----------------+ +| SSD-resnet50_v1 512* | VOC | 224 | 17.81 | 34.55 | 1.94 | 80.21 | 80.16 |
++-----------------------+----------+------------+------------------+------------------+---------+-----------------+-----------------+ +| SSD-mobilenet1.0 512* | VOC | 224 | 31.13 | 48.72 | 1.57 | 75.42 | 75.04 | ++-----------------------+----------+------------+------------------+------------------+---------+-----------------+-----------------+ + +Quantized SSD models are evaluated with ``nms_thresh=0.45``, ``nms_topk=200``. + +Demo usage for SSD +------------------ + +.. code:: bash + + # with Pascal VOC validation dataset saved on disk + python eval_ssd.py --network=vgg16_atrous --quantized --data-shape=300 --batch-size=224 --dataset=voc + +Usage: + +:: + + SYNOPSIS + python eval_ssd.py [-h] [--network NETWORK] [--quantized] + [--data-shape DATA_SHAPE] [--batch-size BATCH_SIZE] + [--dataset DATASET] [--num-workers NUM_WORKERS] + [--num-gpus NUM_GPUS] [--pretrained PRETRAINED] + [--save-prefix SAVE_PREFIX] + + OPTIONS + -h, --help show this help message and exit + --network NETWORK Base network name + --quantized use int8 pretrained model + --data-shape DATA_SHAPE + Input data shape + --batch-size BATCH_SIZE + eval mini-batch size + --dataset DATASET eval dataset. + --num-workers NUM_WORKERS, -j NUM_WORKERS + Number of data workers + --num-gpus NUM_GPUS number of gpus to use. + --pretrained PRETRAINED + Load weights from previously saved parameters. + --save-prefix SAVE_PREFIX + Saving parameter prefix +""" diff --git a/docs/tutorials/index.rst b/docs/tutorials/index.rst index ddc8f2596d..37b760a3ba 100644 --- a/docs/tutorials/index.rst +++ b/docs/tutorials/index.rst @@ -218,6 +218,10 @@ Deployment :title: C++ Inference with GluonCV :link: ../build/examples_deployment/cpp_inference.html + .. card:: + :title: Inference with Quantized Models + :link: ../build/examples_deployment/int8_inference.html + .. toctree:: :hidden: :maxdepth: 2