Image-Encoders-for-GP-NNS

This is the official repository for the ICMR23 paper "Improving Image Encoders for General-Purpose Nearest Neighbor Search and Classification".

This paper investigates the effectiveness of different vision foundation models on two challenging nearest neighbor search-based tasks: zero-shot retrieval and k-NN classification. We establish a benchmark for evaluating various vision encoders and their pre-training methods, and observe significant performance differences between these models. Additionally, we propose a fine-tuning regime that improves zero-shot retrieval and k-NN classification by training on a combination of large publicly available datasets without specializing in any data domain. Our results show that the retrained vision encoders generalize better across different search-based tasks and can be used as general-purpose embedding models for image retrieval.
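For illustration, the sketch below shows what k-NN classification over image embeddings looks like in principle. It is a minimal example, not the paper's evaluation code; the function and variable names are placeholders, and it assumes L2-normalized embedding vectors.

```python
import numpy as np

def knn_classify(query_emb, gallery_embs, gallery_labels, k=5):
    """Classify one query embedding by majority vote of its k nearest
    gallery embeddings. All vectors are assumed L2-normalized, so cosine
    similarity reduces to a dot product."""
    sims = gallery_embs @ query_emb          # similarity to every gallery image
    top_k = np.argsort(-sims)[:k]            # indices of the k most similar
    votes = gallery_labels[top_k]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]         # most frequent label wins
```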

CLIP pre-trained text-to-image models perform best in this benchmark. Their performance can be increased significantly further through general-purpose, retrieval-specific fine-tuning:

[Figure: general-purpose fine-tuning improves retrieval results]

Setup

We recommend using Anaconda. Run the following commands in a console to set up a virtual environment and install the necessary dependencies.

```sh
# create and activate a fresh environment
conda create -n gpret
conda activate gpret
# install Jupyter and the pinned PyTorch/torchvision versions
conda install notebook pytorch::pytorch=1.13.1 pytorch::torchvision=0.14.1
# fetch the repository and start the notebook server
git clone https://github.com/Visual-Computing/Image-Encoders-for-GP-NNS.git
cd Image-Encoders-for-GP-NNS/
jupyter notebook
```

Model Checkpoints

| Base Model | GFLOPs/image | Fine-tuning Loss | Average Benchmark Score | Link |
| --- | --- | --- | --- | --- |
| CLIP ViT-B/16@224 | ~20 | ArcMargin | 67.9 | checkpoint |
| CLIP ViT-L/14@336 | ~180 | ArcMargin | 77.3 | checkpoint |
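The "ArcMargin" loss in the table is an additive angular margin loss in the ArcFace family. The sketch below illustrates the general idea; it is not the authors' training code, and the scale and margin values are hypothetical defaults.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcMarginHead(nn.Module):
    """ArcFace-style additive angular margin head (illustrative sketch)."""
    def __init__(self, emb_dim, num_classes, scale=30.0, margin=0.3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, emb_dim))
        self.scale, self.margin = scale, margin

    def forward(self, embeddings, labels):
        # Cosine similarity between normalized embeddings and class centers.
        cos = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        # Add the angular margin only to the target-class logits.
        target = F.one_hot(labels, cos.size(1)).bool()
        logits = self.scale * torch.where(target, torch.cos(theta + self.margin), cos)
        return F.cross_entropy(logits, labels)
```

Pushing the target-class angle apart by a fixed margin forces embeddings of the same class to cluster more tightly, which benefits nearest neighbor search directly.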

Using our models

An example of how to use our fine-tuned models for inference is shown in this notebook. Download the checkpoints and follow the Setup instructions above, then open the .ipynb file in the notebook directory via the Jupyter Notebook web interface. Edit the code in the notebook and run the cells.
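As a rough outline of what the notebook does, the following sketch loads a CLIP vision encoder and computes a normalized image embedding. It assumes the OpenAI clip package; the checkpoint filename and state-dict layout are placeholders, so consult the notebook for the exact loading code.

```python
import torch
import clip  # OpenAI CLIP package (pip install clip @ git+https://github.com/openai/CLIP.git)
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/16", device=device)

# Hypothetical: overwrite the weights with our fine-tuned checkpoint;
# the actual file name and key layout may differ.
state = torch.load("checkpoint_vitb16.pt", map_location=device)
model.load_state_dict(state, strict=False)

image = preprocess(Image.open("query.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    emb = model.encode_image(image)
    emb = emb / emb.norm(dim=-1, keepdim=True)  # L2-normalize for retrieval
```

The normalized embeddings can then be indexed and queried with any nearest neighbor library, as in the k-NN example above.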

Evaluation

Coming Soon

Fine-tuning

Coming Soon
