0.4.0 Release Note

Highlights

GluonCV v0.4 adds pose estimation models, Int8 quantization for Intel CPUs, FPN Faster/Mask-RCNN, and wide SE/ResNeXt models, along with multiple usability improvements.

We strongly recommend using GluonCV 0.4.0 with MXNet>=1.4.0 to avoid dependency issues. Some specific tasks may require an MXNet nightly build. See https://gluon-cv.mxnet.io/index.html

New Models released in 0.4

Model	Metric	0.4
simple_pose_resnet152_v1b	OKS AP*	74.2
simple_pose_resnet50_v1b	OKS AP*	72.2
ResNext50_32x4d	ImageNet Top-1	79.32
ResNext101_64x4d	ImageNet Top-1	80.69
SE_ResNext101_32x4d	ImageNet Top-1	79.95
SE_ResNext101_64x4d	ImageNet Top-1	81.01
yolo3_mobilenet1.0_coco	COCO mAP	28.6

* Using Ground-Truth person detection results

Int8 Quantization with Intel Deep Learning boost

GluonCV now integrates with Intel's Vector Neural Network Instructions (VNNI) to accelerate model inference.
Note that you will need a capable Intel Skylake CPU to see a comparable speedup.

Model	Dataset	Batch Size	C5.18x FP32	C5.18x INT8	Speedup	FP32 Acc	INT8 Acc
resnet50_v1	ImageNet	128	122.02	276.72	2.27	77.21%/93.55%	76.86%/93.46%
mobilenet1.0	ImageNet	128	375.33	1016.39	2.71	73.28%/91.22%	72.85%/90.99%
ssd_300_vgg16_atrous_voc*	VOC	224	21.55	31.47	1.46	77.4	77.46
ssd_512_vgg16_atrous_voc*	VOC	224	7.63	11.69	1.53	78.41	78.39
ssd_512_resnet50_v1_voc*	VOC	224	17.81	34.55	1.94	80.21	80.16
ssd_512_mobilenet1.0_voc*	VOC	224	31.13	48.72	1.57	75.42	75.04

*nms_thresh=0.45, nms_topk=200
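
The Speedup column is consistent with dividing the INT8 figure by the FP32 figure in each row. The sketch below reproduces it from the table values. It assumes both columns report throughput in the same unit, which the table does not state; `RESULTS` and `speedup` are names introduced here for illustration only.

```python
# (FP32, INT8) figures copied from the table above.
# Assumption: both are throughput numbers in the same unit.
RESULTS = {
    "resnet50_v1": (122.02, 276.72),
    "mobilenet1.0": (375.33, 1016.39),
    "ssd_300_vgg16_atrous_voc": (21.55, 31.47),
    "ssd_512_vgg16_atrous_voc": (7.63, 11.69),
    "ssd_512_resnet50_v1_voc": (17.81, 34.55),
    "ssd_512_mobilenet1.0_voc": (31.13, 48.72),
}


def speedup(fp32, int8):
    """Ratio of the INT8 figure to the FP32 figure, rounded to two decimals."""
    return round(int8 / fp32, 2)


speedups = {name: speedup(fp32, int8) for name, (fp32, int8) in RESULTS.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```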

Usage of an int8 quantized model is identical to that of a standard GluonCV model; simply append the suffix _int8 to the model name.
For example, use resnet50_v1_int8 as the int8 quantized version of resnet50_v1.
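
A minimal sketch of the naming rule, assuming the quantized model is loaded through `gluoncv.model_zoo.get_model` like any other pretrained model. The helpers `int8_name` and `load_quantized` are hypothetical names introduced here, not part of GluonCV.

```python
def int8_name(model_name):
    """Return the int8 variant name by appending the _int8 suffix (idempotent)."""
    suffix = "_int8"
    return model_name if model_name.endswith(suffix) else model_name + suffix


def load_quantized(model_name):
    """Load the int8 variant from the model zoo.

    Requires gluoncv and mxnet; the import is deferred so that
    int8_name() can be used without them installed.
    """
    from gluoncv import model_zoo

    return model_zoo.get_model(int8_name(model_name), pretrained=True)


print(int8_name("resnet50_v1"))
```

Calling `load_quantized("resnet50_v1")` would then request resnet50_v1_int8. Not every model necessarily has an int8 variant, so treat the tables above as the list of supported names.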

Pruned ResNet

https://gluon-cv.mxnet.io/model_zoo/classification.html#pruned-resnet

Pruning channels of convolution layers is a very effective way to reduce model redundancy, with the aim of speeding up inference without sacrificing significant accuracy. GluonCV 0.4 includes several pruned ResNets derived from the original GluonCV SoTA ResNets for ImageNet.

Model	Top-1	Top-5	Hashtag	Speedup (to original ResNet)
resnet18_v1b_0.89	67.2	87.45	54f7742b	2x
resnet50_v1d_0.86	78.02	93.82	a230c33f	1.68x
resnet50_v1d_0.48	74.66	92.34	0d3e69bb	3.3x
resnet50_v1d_0.37	70.71	89.74	9982ae49	5.01x
resnet50_v1d_0.11	63.22	84.79	6a25eece	8.78x
resnet101_v1d_0.76	79.46	94.69	a872796b	1.8x
resnet101_v1d_0.73	78.89	94.48	712fccb1	2.02x

Scripts for pruning ResNets will be released in the future.
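
To illustrate the accuracy/speed trade-off in the table, here is a rough sketch that picks the fastest pruned variant of a given base network that still meets a Top-1 floor. The data is copied from the table; `PRUNED` and `fastest_with_top1` are hypothetical names introduced for this example. Because each speedup is relative to that model's own original ResNet, the comparison is kept within one base network.

```python
# (name, top-1, top-5, speedup vs. original ResNet), copied from the table above.
PRUNED = [
    ("resnet18_v1b_0.89", 67.2, 87.45, 2.0),
    ("resnet50_v1d_0.86", 78.02, 93.82, 1.68),
    ("resnet50_v1d_0.48", 74.66, 92.34, 3.3),
    ("resnet50_v1d_0.37", 70.71, 89.74, 5.01),
    ("resnet50_v1d_0.11", 63.22, 84.79, 8.78),
    ("resnet101_v1d_0.76", 79.46, 94.69, 1.8),
    ("resnet101_v1d_0.73", 78.89, 94.48, 2.02),
]


def fastest_with_top1(base, min_top1):
    """Return the name of the fastest pruned variant of `base` whose
    Top-1 accuracy is at least `min_top1`, or None if there is none."""
    candidates = [
        row for row in PRUNED
        if row[0].startswith(base + "_") and row[1] >= min_top1
    ]
    if not candidates:
        return None
    # Highest speedup wins among the variants that meet the accuracy floor.
    return max(candidates, key=lambda row: row[3])[0]


print(fastest_with_top1("resnet50_v1d", 74.0))
```

The returned string is a model name from the table, so it can be passed to the model zoo in the same way as any other model name.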

More GANs (thanks @husonchen)

SRGAN

A GluonCV implementation of SRGAN, from "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network": https://github.com/dmlc/gluon-cv/tree/master/scripts/gan/srgan

CycleGAN

Image-to-Image translation reproduced in GluonCV: https://github.com/dmlc/gluon-cv/tree/master/scripts/gan/cycle_gan

Residual Attention Network (thanks @PistonY)

GluonCV implementation of https://arxiv.org/abs/1704.06904

New application: Human Pose Estimation

https://gluon-cv.mxnet.io/model_zoo/pose.html

Human Pose Estimation in GluonCV is a complete application set, including model definitions, training scripts, and useful loss and metric functions. We also include some pre-trained models and usage tutorials.

Model	OKS AP	OKS AP (with flip)
simple_pose_resnet18_v1b	66.3/89.2/73.4	68.4/90.3/75.7
simple_pose_resnet18_v1b	52.8/83.6/57.9	54.5/84.8/60.3
simple_pose_resnet50_v1b	71.0/91.2/78.6	72.2/92.2/79.9
simple_pose_resnet50_v1d	71.6/91.3/78.7	73.3/92.4/80.8
simple_pose_resnet101_v1b	72.4/92.2/79.8	73.7/92.3/81.1
simple_pose_resnet101_v1d	73.0/92.2/80.8	74.2/92.4/82.0
simple_pose_resnet152_v1b	72.4/92.1/79.6	74.2/92.3/82.1
simple_pose_resnet152_v1d	73.4/92.3/80.7	74.6/93.4/82.1
simple_pose_resnet152_v1d	74.8/92.3/82.0	76.1/92.4/83.2

Feature Pyramid Network for Faster/Mask-RCNN

Model	bbox/seg mAP	Caffe bbox/seg
faster_rcnn_fpn_resnet50_v1b_coco	0.384/-	0.379/-
faster_rcnn_fpn_bn_resnet50_v1b_coco	0.393/-	-
faster_rcnn_fpn_resnet101_v1d_coco	0.412/-	0.398/-
maskrcnn_fpn_resnet50_v1b_coco	0.392/0.353	0.386/0.345
maskrcnn_fpn_resnet101_v1d_coco	0.423/0.377	0.409/0.364

Bug fixes and Improvements

All ResNet definitions in GluonCV now support Synchronized BatchNorm
Pretrained object detection models now support reset_class to reuse partial category knowledge, so some tasks may no longer need fine-tuning: https://gluon-cv.mxnet.io/build/examples_detection/skip_fintune.html#sphx-glr-build-examples-detection-skip-fintune-py
Fix some dataloader issues (requires mxnet >= 1.4.0)
Fix some segmentation models that would not hybridize
Fix random NaN problems in some detection models (requires the latest mxnet nightly build, >= 20190315)
Various other minor bug fixes
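
The reset_class improvement listed above can be sketched as follows. This assumes that detection models expose a `classes` attribute and that `reset_class` accepts `classes` and `reuse_weights` arguments, as in the linked tutorial. The helpers `reusable_classes` and `shrink_detector` are hypothetical names introduced here for illustration.

```python
def reusable_classes(wanted, pretrained):
    """Split the wanted class names into those the pretrained model
    already knows (weights can be reused) and those it does not."""
    known = set(pretrained)
    reuse = [c for c in wanted if c in known]
    fresh = [c for c in wanted if c not in known]
    return reuse, fresh


def shrink_detector(model_name, wanted):
    """Load a pretrained detector and restrict it to the `wanted` classes.

    Requires gluoncv and mxnet; the import is deferred so that
    reusable_classes() can be used without them installed.
    """
    from gluoncv import model_zoo

    net = model_zoo.get_model(model_name, pretrained=True)
    reuse, _ = reusable_classes(wanted, net.classes)
    # Classes in `reuse` keep their pretrained weights; any other class
    # starts from fresh weights and would still need fine-tuning.
    net.reset_class(classes=wanted, reuse_weights=reuse)
    return net


print(reusable_classes(["person", "unicorn"], ["person", "car"]))
```

For example, `shrink_detector("yolo3_mobilenet1.0_coco", ["person"])` would aim to produce a person-only detector without further training, provided "person" is among the pretrained classes.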

