This repository is the official PyTorch implementation of the paper "Annealing-based Label-Transfer Learning for Open-World Object Detection".
NOTE:
- In the `master` branch, we apply our method to the Faster R-CNN framework; in the `ow-detr` branch, we apply it to the same Deformable DETR framework as OW-DETR.
- If you want to learn more about the disentanglement and the visualization of our approach, please check out the supplementary video.
The key code of the RCNN-based and DETR-based models is described below, respectively:
- The RCNN-based model is built on the detectron2 framework. The main structure of the model is set up in `detectron2/modeling/meta_arch/rcnn.py`.
- In the forming stage, we set `cfg.OWOD.COOLING = False` to fix the disentanglement degree $\lambda = 0$ and form entangled known proposals. In the extending stage, we simply set `cfg.OWOD.COOLING = True` to begin the collaborative learning of known and unknown classes.
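The two-stage switch above can be pictured as a simple schedule. The sketch below is illustrative only — the function name and the linear annealing rule are our assumptions, not the repository's exact code — but it shows how the `COOLING` flag would govern the disentanglement degree $\lambda$:

```python
def disentanglement_degree(cooling: bool, step: int, total_steps: int) -> float:
    """Hypothetical annealing schedule for the disentanglement degree.

    Forming stage (cooling disabled): lambda stays at 0, so known
    proposals remain entangled with potential unknown objects.
    Extending stage (cooling enabled): lambda anneals from 0 toward 1,
    gradually separating known and unknown classes.
    """
    if not cooling:
        return 0.0
    # Linear warm-up is an assumption; the paper's schedule may differ.
    return min(1.0, step / total_steps)
```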
- Python 3.7, CUDA 11.1, torch 1.10.1
- `pip install -r requirements.txt`
- `pip install -e .`
- You can download the datasets from here and follow these steps to configure the paths:
  - Create the folder `datasets/VOC2007`
  - Put `Annotations` and `JPEGImages` inside `datasets/VOC2007`
  - Create the folder `datasets/VOC2007/ImageSets/Main`
  - Put the content of `datasets/OWOD_imagesets` inside `datasets/VOC2007/ImageSets/Main`
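The steps above can be sketched as shell commands, run from the repository root (the source paths for the downloaded data are placeholders, so those lines are left commented):

```shell
# Create the expected folder layout
mkdir -p datasets/VOC2007/ImageSets/Main

# Move the downloaded data into place (source paths are placeholders):
# mv /path/to/Annotations /path/to/JPEGImages datasets/VOC2007/
# cp -r datasets/OWOD_imagesets/. datasets/VOC2007/ImageSets/Main/
```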
We have trained and tested our models on Ubuntu 16.04, CUDA 11.1, GCC 5.4, and Python 3.7:

```shell
conda create -n owdetr python=3.7
conda activate owdetr
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch
pip install -r requirements.txt
```
```shell
cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py
```
- You can download the dataset from here and follow these steps to configure the path. The files should be organized in the following structure:

```
OW-DETR/
└── data/
    └── VOC2007/
        └── OWOD/
            ├── JPEGImages
            ├── ImageSets
            └── Annotations
```
You can download the pre-trained backbone network models and the best OWOD models trained by our method for Tasks t1-t4 here.
- Download the pre-trained backbone network model: `R-50.pkl` is for the Faster R-CNN framework and `dino_resnet50_pretrain.pth` is for the OW-DETR framework.
- Set the path of the pre-trained model in the configs.
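For the RCNN-based (detectron2) branch, the backbone checkpoint path is typically set through the `MODEL.WEIGHTS` config key. A minimal YAML fragment — which config file to edit depends on the task, so this snippet is only illustrative:

```yaml
MODEL:
  WEIGHTS: "R-50.pkl"  # path to the downloaded backbone checkpoint
```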
- You can run the `train_*.sh` scripts in the `scripts` folder stage by stage, where `_t*_` represents tasks t1-t4. Scripts without a suffix (e.g. `train_t2.sh`) run the incremental process of the forming stage, scripts with the `_ft` suffix (e.g. `train_t2_ft.sh`) run the fine-tuning process of the forming stage, and scripts with the `_extending` suffix (e.g. `train_t2_extending.sh`) run the extending stage. (Task t1 does not need fine-tuning because it has no previously known classes.)
- You should run the scripts in order, for example:

```shell
bash scripts/train_t2.sh
bash scripts/train_t2_ft.sh
bash scripts/train_t2_extending.sh
```
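The full ordering across all four tasks can be sketched as a dry run that only prints the commands (t1 skips the fine-tuning step; the existence of a `train_t1_extending.sh` script is our assumption from the naming pattern above):

```shell
# Print the training order for tasks t1-t4 without executing anything.
plan=""
for t in 1 2 3 4; do
  plan="${plan}bash scripts/train_t${t}.sh\n"
  if [ "$t" -gt 1 ]; then
    # Only tasks t2-t4 have a fine-tuning step.
    plan="${plan}bash scripts/train_t${t}_ft.sh\n"
  fi
  plan="${plan}bash scripts/train_t${t}_extending.sh\n"
done
printf "%b" "$plan"
```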
- You can simply run the `test_*.sh` scripts in the `scripts` folder.
If this work helps your research, please consider citing:

```bibtex
@inproceedings{ma2021annealing,
  title={Annealing-based Label-Transfer Learning for Open World Object Detection},
  author={Ma, Yuqing and Li, Hainan and Zhang, Zhange and Guo, Jinyang and Zhang, Shanghang and Gong, Ruihao and Liu, Xianglong},
  booktitle={CVPR},
  year={2023}
}
```