HoughNet-VID: Spatial and temporal voting for video object detection

Official PyTorch implementation of HoughNet for video object detection. More details can be found in the paper:

HoughNet: Integrating near and long-range evidence for visual detection,
Nermin Samet, Samet Hicsonmez, Emre Akbas,
TPAMI, 2022. (arXiv pre-print)

Summary

The original HoughNet applies voting only in the spatial domain, for object detection in still images. We extend this idea to the temporal domain with a new method that takes the difference of features from two frames and applies spatial and temporal voting, using our “temporal voting module”, to detect objects. We show the effectiveness of our method on the ILSVRC2015 dataset.

Video Object Detection Results on ILSVRC2015

Method	mAP	mAP_Fast	mAP_Medium	mAP_Slow
HoughNet (single frame baseline)	68.8	45.8	66.1	79.1
HoughNet-VID	73.9	50.4	71.5	82.8
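
To make the comparison easier to read, the per-category gain of HoughNet-VID over the single-frame baseline can be computed directly from the table. The snippet below is only an illustrative sketch: the `RESULTS` dictionary and the `gains` function are hypothetical names introduced here and are not part of the repository.

```python
# Values copied from the results table above.
RESULTS = {
    "HoughNet (single frame baseline)": {
        "mAP": 68.8, "mAP_Fast": 45.8, "mAP_Medium": 66.1, "mAP_Slow": 79.1,
    },
    "HoughNet-VID": {
        "mAP": 73.9, "mAP_Fast": 50.4, "mAP_Medium": 71.5, "mAP_Slow": 82.8,
    },
}


def gains(baseline, improved):
    """Return the per-metric difference (improved - baseline), rounded to one decimal."""
    return {metric: round(improved[metric] - baseline[metric], 1) for metric in baseline}


if __name__ == "__main__":
    delta = gains(RESULTS["HoughNet (single frame baseline)"], RESULTS["HoughNet-VID"])
    for metric, value in delta.items():
        print(f"{metric}: +{value}")
```

With the numbers in the table, this gives a gain of 5.1 mAP overall, and 4.6, 5.4 and 3.7 for fast, medium and slow objects respectively.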

The temporal voting model can be downloaded here.

Installation

Please refer to the installation instructions of HoughNet.

Dataset Preparation

Please download the ILSVRC2015 DET and ILSVRC2015 VID datasets from here. Next, place the data as follows. Alternatively, you can create symlinks.

./data/ILSVRC2015/
./data/ILSVRC2015/Annotations/DET
./data/ILSVRC2015/Annotations/VID
./data/ILSVRC2015/Data/DET
./data/ILSVRC2015/Data/VID
./data/ILSVRC2015/ImageSets
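
If the dataset already lives elsewhere on disk, the layout above can be set up with a symlink and then checked. The following is a minimal sketch using only the Python standard library; `link_dataset`, `missing_subdirs` and `EXPECTED_SUBDIRS` are hypothetical helpers written for this README and are not part of the HoughNet-VID code base. It assumes the downloaded folder already contains the `Annotations`, `Data` and `ImageSets` sub-directories.

```python
from pathlib import Path

# Sub-directories listed above, relative to ./data/ILSVRC2015/
EXPECTED_SUBDIRS = [
    "Annotations/DET",
    "Annotations/VID",
    "Data/DET",
    "Data/VID",
    "ImageSets",
]


def link_dataset(source_root, data_dir="./data"):
    """Symlink an existing ILSVRC2015 folder to <data_dir>/ILSVRC2015."""
    source = Path(source_root).expanduser().resolve()
    target = Path(data_dir) / "ILSVRC2015"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Do not overwrite an existing directory or symlink.
    if not target.exists() and not target.is_symlink():
        target.symlink_to(source, target_is_directory=True)
    return target


def missing_subdirs(dataset_root):
    """Return the expected sub-directories that are absent under dataset_root."""
    root = Path(dataset_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_subdirs("./data/ILSVRC2015")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Calling `link_dataset("/path/to/your/ILSVRC2015")` once (the path is a placeholder) and then running the script reports any directory that is still missing.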

We used the train and test images from MEGA. You can download the same splits, as ImageSets, here.

Evaluation and Training

For evaluation and training, please refer to the instructions here.

Acknowledgement

This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) through the project titled "Object Detection in Videos with Deep Neural Networks" (grant number 117E054). The numerical calculations reported in this paper were partially performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources).

License

HoughNet-VID is released under the MIT License (refer to the LICENSE file for details).

Citation

If you find HoughNet-VID useful for your research, please cite our paper as follows.

N. Samet, S. Hicsonmez, E. Akbas, "HoughNet: Integrating near and long-range evidence for visual detection", arXiv, 2021.

BibTeX entry:

@misc{HoughNet2021,
      title={HoughNet: Integrating near and long-range evidence for visual detection}, 
      author={Nermin Samet and Samet Hicsonmez and Emre Akbas},
      year={2021}, 
}
