Gluon CV Toolkit 0.3.0
0.3 Release Note
Highlights
Added 5 new algorithms and updated 38 pre-trained models with improved accuracy
Comparison of 7 selected models:
Model | Metric | 0.2 | 0.3 | Reference |
---|---|---|---|---|
ResNet-50 | top-1 acc on ImageNet | 77.07% | 79.15% | 75.3% (Caffe impl) |
ResNet-101 | top-1 acc on ImageNet | 78.81% | 80.51% | 76.4% (Caffe impl) |
MobileNet 1.0 | top-1 acc on ImageNet | N/A | 73.28% | 70.9% (TensorFlow impl) |
Faster-RCNN | mAP on COCO | N/A | 40.1% | 39.6% (Detectron) |
Yolo-v3 | mAP on COCO | N/A | 37.0% | 33.0% (paper) |
DeepLab-v3 | mIoU on VOC | N/A | 86.7% | 85.7% (paper) |
Mask-RCNN | mask AP on COCO | N/A | 33.1% | 32.8% (Detectron) |
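
As a quick reading aid, the gap between each 0.3 result and its reference can be computed directly from the table. The sketch below only restates the numbers above; the names `ROWS` and `gains` are made up for illustration.

```python
# Rows copied from the comparison table: (model, 0.3 result, reference), in percent.
ROWS = [
    ("ResNet-50", 79.15, 75.3),
    ("ResNet-101", 80.51, 76.4),
    ("MobileNet 1.0", 73.28, 70.9),
    ("Faster-RCNN", 40.1, 39.6),
    ("Yolo-v3", 37.0, 33.0),
    ("DeepLab-v3", 86.7, 85.7),
    ("Mask-RCNN", 33.1, 32.8),
]


def gains(rows):
    """Return {model: gain over the reference, in percentage points}."""
    return {name: round(ours - ref, 2) for name, ours, ref in rows}


GAINS = gains(ROWS)
for name, gain in GAINS.items():
    print(f"{name}: +{gain:.2f} points over reference")
```

Note that the references use different implementations and, in some cases, different backbones or training recipes, so the gaps are indicative rather than strictly like-for-like.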
Interactive visualizations for pre-trained models
Interactive visualizations are available for the image classification and object detection models.
Deploy without Python
All models are hybridizable, so they can be deployed without Python. See the tutorials for deploying these models in C++.
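A minimal sketch of the export step that precedes a non-Python deployment, assuming the standard Gluon `hybridize()` and `export()` calls. The function names, the default model name `resnet50_v1b`, the output prefix, and the 224x224 input size are illustrative assumptions, not part of this release note.

```python
def exported_files(prefix, epoch=0):
    """File names that HybridBlock.export(prefix, epoch) writes."""
    return [f"{prefix}-symbol.json", f"{prefix}-{epoch:04d}.params"]


def export_for_deployment(model_name="resnet50_v1b", prefix="exported_model"):
    """Hybridize a pre-trained model and export it for use outside Python."""
    # Imported lazily so the helper above also works without MXNet installed.
    import mxnet as mx
    from gluoncv import model_zoo

    # Model name is an assumption; any model zoo entry could be used here.
    net = model_zoo.get_model(model_name, pretrained=True)
    net.hybridize()
    # One forward pass builds the cached symbolic graph that export() saves.
    # The input shape is an assumption suited to ImageNet classifiers.
    net(mx.nd.zeros((1, 3, 224, 224)))
    net.export(prefix)
    return exported_files(prefix)
```

Calling `export_for_deployment()` downloads the pre-trained weights and should leave a symbol file and a parameter file on disk, which are the inputs a C++ predictor would load.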
New Models with Training Scripts
DenseNet, DarkNet, SqueezeNet for image classification
We now provide a broader range of model families, suitable both for out-of-the-box use and for research.
YoloV3 for object detection
Significantly more accurate than the original paper. For example, we get 37.0% mAP on COCO versus the original paper's 33.0%. The techniques we used will be described in a paper to be released later.
Mask-RCNN for instance segmentation
Accuracy now matches Caffe2 Detectron without FPN, e.g. 38.3% box AP and 33.1% mask AP on COCO with ResNet50.
FPN support will come in future versions.
DeepLabV3 for semantic segmentation
Slightly more accurate than the original paper. For example, we get 86.7% mIoU on Pascal VOC versus the original paper's 85.7%.
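For readers unfamiliar with the metric, mIoU averages the per-class intersection over union. The sketch below computes it from a confusion matrix in plain Python; it is an illustration of the metric with toy numbers, not the evaluation code used for the results above.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    Rows are ground-truth classes, columns are predicted classes.
    """
    ious = []
    for c in range(len(confusion)):
        true_pos = confusion[c][c]
        gt_total = sum(confusion[c])                     # pixels labelled c
        pred_total = sum(row[c] for row in confusion)    # pixels predicted c
        union = gt_total + pred_total - true_pos
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example (made-up pixel counts).
toy = [[8, 2],
       [1, 9]]
print(f"mIoU = {mean_iou(toy):.4f}")
```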
WGAN
Reproduced WGAN with ResNet
Person Re-identification
Provides a baseline model that achieves a best rank-1 score of 93.1 on the Market1501 dataset.
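The rank-1 score is the fraction of query images whose closest gallery image shows the same person. A simplified sketch follows, assuming a precomputed query-by-gallery distance matrix; the standard Market1501 protocol also discards gallery images taken by the same camera as the query, which is omitted here.

```python
def rank1_accuracy(dist, query_ids, gallery_ids):
    """Fraction of queries whose nearest gallery entry has the same identity.

    dist[q][g] is the distance between query q and gallery image g.
    """
    hits = 0
    for q, row in enumerate(dist):
        nearest = min(range(len(row)), key=row.__getitem__)
        if gallery_ids[nearest] == query_ids[q]:
            hits += 1
    return hits / len(dist)


# Toy example with three queries and two gallery images (made-up distances).
dist = [[0.1, 0.9],
        [0.8, 0.2],
        [0.3, 0.7]]
print(rank1_accuracy(dist, query_ids=[1, 2, 2], gallery_ids=[1, 2]))
```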
Enhanced Models with Better Accuracy
Faster R-CNN
- Improved Pascal VOC model accuracy. mAP improves to 78.3% from previous version's 77.9%. VOC models with 80%+ mAP will be released with the tech paper.
- Added models trained on COCO dataset.
- The ResNet50 model now achieves 37.0 mAP, outperforming Caffe2 Detectron without FPN (36.5 mAP).
- The ResNet101 model achieves 40.1 mAP, outperforming Caffe2 Detectron with FPN (39.8 mAP).
- FPN support will come in future versions.
ResNet, MobileNet, DarkNet, Inception for image classification
- Significantly improved accuracy for some models. For example, ResNet50_v1b reaches 78.3%, up from 77.07% in the previous version.
- Added models trained with mixup and distillation. For example, ResNet50_v1d has 3 versions: ResNet50_v1d_distill (78.67%), ResNet50_v1d_mixup (79.16%), ResNet50_v1d_mixup_distill (79.29%).
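As background on the mixup technique mentioned above: each training sample is a convex combination of two inputs and their labels, with the weight drawn from a Beta distribution. The following plain-Python sketch shows the idea only; the function name and `alpha=0.2` are assumptions, and this is not the training script used for these models.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels.

    The mixing weight lam is drawn from Beta(alpha, alpha); alpha=0.2 is an
    illustrative default, not a value taken from this release.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy "images" of two pixels each, with one-hot labels for two classes.
mixed_x, mixed_y, lam = mixup_pair([1.0, 0.0], [1, 0], [0.0, 1.0], [0, 1])
print(lam, mixed_x, mixed_y)
```

The mixed label stays a valid probability distribution, so the usual cross-entropy loss with soft targets applies unchanged.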
Semantic Segmentation
- Synchronized Batch Normalization training.
- Added Cityscapes dataset and pretrained models.
- Added training details for reproducing state-of-the-art results on Pascal VOC, and provided COCO pre-trained models for VOC.
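To illustrate what synchronized batch normalization changes: instead of each device normalizing with statistics from its own small slice of the batch, the devices pool their counts, sums, and sums of squares and share one mean and variance. The sketch below shows that arithmetic for a single channel in plain Python; it is a conceptual illustration with made-up values, not GluonCV's implementation or API.

```python
def synced_batch_stats(per_device_batches):
    """Mean and biased variance over all devices, from per-device partial sums."""
    # Each device only has to share three numbers: count, sum, sum of squares.
    partials = [(len(b), sum(b), sum(v * v for v in b)) for b in per_device_batches]
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / n
    var = total_sq / n - mean * mean
    return mean, var


# Two devices, two activations each (toy values).
mean, var = synced_batch_stats([[1.0, 2.0], [3.0, 4.0]])
print(mean, var)
```

Semantic segmentation typically fits only a few large images per GPU, which is why pooling statistics across devices matters there.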
Dependency
GluonCV 0.3.0 now depends on incubator-mxnet >= 1.3.0. Please update mxnet according to the installation guide to avoid compatibility issues.
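A small sketch of how a script might check this requirement before running. The helper names are hypothetical, and the comparison looks only at the leading numeric part of the version string, so a pre-release build of 1.3.0 would be treated as meeting the minimum.

```python
import re


def version_tuple(version):
    """Leading numeric components of a version string, padded to three parts."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    parts += [0] * (3 - len(parts))  # "1.3" is treated as "1.3.0"
    return tuple(parts)


def meets_minimum(version, minimum="1.3.0"):
    """True if `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


print(meets_minimum("1.2.1"), meets_minimum("1.3.0"))
```

In practice the installed version would come from `mxnet.__version__`, for example `meets_minimum(mxnet.__version__)`.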