
Implement of Zero-Shot CLIP Classifier #1737

Merged · 8 commits · Sep 4, 2023
Conversation


@Coobiw Coobiw commented Aug 1, 2023

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

Motivation

I implemented a zero-shot version of CLIP. Evaluated on CIFAR100 and ImageNet-1k, it shows only a small gap from the official (OpenAI) results. After adding a sub-prompt set (8 prompts, about 1/10 of the full prompt-engineering set used previously), performance improves further.
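The sub-prompt idea above is prompt ensembling: each class name is expanded with several templates, the per-template text embeddings are averaged, and the mean is re-normalized. A minimal NumPy sketch; the templates and the `encode_text` stub below are illustrative stand-ins, not the PR's actual prompts or CLIP's real text encoder:

```python
import numpy as np

# Illustrative templates; the PR's actual 8 sub-prompts may differ.
TEMPLATES = [
    'a photo of a {}.',
    'a blurry photo of a {}.',
    'a photo of the small {}.',
    'a photo of the large {}.',
]

def encode_text(prompt: str, dim: int = 8) -> np.ndarray:
    """Stand-in for CLIP's text encoder: a deterministic pseudo-embedding."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2 ** 32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def class_embedding(classname: str) -> np.ndarray:
    """Average the per-template embeddings, then re-normalize the mean."""
    embs = np.stack([encode_text(t.format(classname)) for t in TEMPLATES])
    mean = embs.mean(axis=0)
    return mean / np.linalg.norm(mean)

w = class_embedding('dog')
```

The re-normalization step matters: the averaged vector shrinks in norm, and the classifier weights are expected to be unit-length before the cosine-similarity comparison.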

Modification

I've added the model files under mmpretrain.models.multimodal.clip_zs and the config files under configs.clip_zs. Just run:

python tools/test.py configs/clip/clip_vit-base-p16_zeroshot-cls_cifar100.py clip-vit-b-p16_converted.pth

Get the checkpoint

pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git

import clip
import torch

model, _ = clip.load('ViT-B/16')
ckpt = model.state_dict()
torch.save(ckpt, 'clip-vit-b-p16.pth')

then run the convert script tools/model_converters/openai-clip_to_mmpretrain-clip.py:

python tools/model_converters/openai-clip_to_mmpretrain-clip.py clip-vit-b-p16.pth clip-vit-b-p16_converted.pth
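The real key mapping lives in tools/model_converters/openai-clip_to_mmpretrain-clip.py; the core pattern of such converters is renaming state-dict keys by prefix. A hedged sketch of that pattern; the prefix pairs in the example are hypothetical, not the script's actual mapping table:

```python
def convert_state_dict(ckpt: dict, prefix_map: list) -> dict:
    """Rename checkpoint keys by the first matching prefix; keep others as-is."""
    out = {}
    for key, value in ckpt.items():
        new_key = key
        for src, dst in prefix_map:
            if key.startswith(src):
                new_key = dst + key[len(src):]
                break
        out[new_key] = value
    return out

# Hypothetical example mapping, not the script's real table:
ckpt = {'visual.conv1.weight': 'w0', 'ln_final.weight': 'w1'}
converted = convert_state_dict(ckpt, [('visual.', 'backbone.')])
# converted keys: 'backbone.conv1.weight', 'ln_final.weight'
```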

I've uploaded the converted weights as clip-vit-b-p16_converted.pth and clip-vit-l-p14_converted.pth

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

# ViT-B/16 for CIFAR100
python tools/test.py configs/clip/clip_vit-base-p16_zeroshot-cls_cifar100.py clip-vit-b-p16_converted.pth

# ViT-L/14 for CIFAR100
python tools/test.py configs/clip/clip_vit-large-p14_zeroshot-cls_cifar100.py clip-vit-l-p14_converted.pth

# ViT-B/16 for IN1k
python tools/test.py configs/clip/clip_vit-base-p16_zeroshot-cls_in1k.py clip-vit-b-p16_converted.pth

# ViT-L/14 for IN1k
python tools/test.py configs/clip/clip_vit-large-p14_zeroshot-cls_in1k.py clip-vit-l-p14_converted.pth
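Under the hood, each of these zero-shot runs reduces to cosine similarity between the image embedding and the per-class text embeddings, scaled by CLIP's logit scale (100 in the released models). A NumPy sketch with toy 4-dimensional features, not mmpretrain's actual API:

```python
import numpy as np

def zero_shot_logits(image_feat, text_feats, logit_scale=100.0):
    """L2-normalize both sides, then return scaled cosine similarity per class."""
    img = image_feat / np.linalg.norm(image_feat)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    return logit_scale * img @ txt.T

# Toy features for 3 classes; class 1's text matches the image best.
image = np.array([0.0, 1.0, 0.0, 0.0])
texts = np.array([
    [1.0, 0.1, 0.0, 0.0],
    [0.1, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])
pred = int(np.argmax(zero_shot_logits(image, texts)))  # → 1
```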

The results are shown in the following format:
(screenshot of evaluation results)

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests, the case that causes the bug should be added in the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

  • If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects, like MMDet or MMSeg.
  • CLA has been signed and all committers have signed the CLA in this PR.

from ..clip_zs.clip import CLIP_zs
from ..clip_zs.clip_transformer import CLIPTransformer, CLIPVisionTransformer

__all__ = ['CLIP_zs', 'CLIPTransformer', 'CLIPVisionTransformer']
Collaborator:
Please rename the class CLIP_zs to CLIP, and check the configs and other .py files as well.

num_workers=8,
dataset=dict(
type='CIFAR100',
data_root='/public/DATA/qbw/img_cls_dataset/cifar100',
Collaborator:

Suggested change
data_root='/public/DATA/qbw/img_cls_dataset/cifar100',
data_root='data/cifar100',

We use this relative-path format instead of your local absolute path in released code; please also check the other configs.


@MODELS.register_module()
class CLIP_zs(BaseModel):
"""The implementation of `ChineseCLIP <https://arxiv.org/abs/2211.01335>`_.
Collaborator:

Update the docstring to CLIP and CLIP's paper link.

@codecov
codecov bot commented Aug 22, 2023

@MODELS.register_module()
class CLIP_zs(CLIP):
Collaborator:
Rename to CLIPZeroShot, please also check other parts
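The rename only changes the string under which the class is registered, which is why the configs must be updated in lockstep. A minimal registry sketch showing this coupling; this is a simplified illustration, not mmengine's real Registry implementation:

```python
class Registry:
    """Toy model registry: maps a class's name to the class itself."""

    def __init__(self):
        self._modules = {}

    def register_module(self):
        def wrap(cls):
            self._modules[cls.__name__] = cls
            return cls
        return wrap

    def build(self, cfg: dict):
        # cfg['type'] must match the registered class name exactly.
        return self._modules[cfg['type']]()

MODELS = Registry()

@MODELS.register_module()
class CLIPZeroShot:
    pass

# After the rename, configs must say type='CLIPZeroShot', not 'CLIP_zs'.
model = MODELS.build({'type': 'CLIPZeroShot'})
```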

Collaborator @fangyixiao18 commented Aug 31, 2023:

Move all 4 of these configs to the configs/clip/ folder and rename them to the format clip_vit-base-p16_zeroshot-cls_cifar100.py, just like chinese_clip.

@fangyixiao18 fangyixiao18 merged commit bb59c9a into open-mmlab:main Sep 4, 2023
6 of 7 checks passed
mzr1996 added a commit that referenced this pull request Oct 25, 2023
* [CodeCamp2023-584]Support DINO self-supervised learning in project (#1756)

* feat: impelemt DINO

* chore: delete debug code

* chore: impplement pre-commit

* fix: fix imported package

* chore: pre-commit check

* [CodeCamp2023-340] New Version of config Adapting MobileNet Algorithm (#1774)

* add new config adapting MobileNetV2,V3

* add base model config for mobile net v3, modified all training configs of mobile net v3 inherit from the base model config

* removed directory _base_/models/mobilenet_v3

* [Feature] Implement of Zero-Shot CLIP Classifier (#1737)

* zero-shot CLIP

* modify zero-shot clip config

* add in1k_sub_prompt(8 prompts) for improvement

* add some annotations doc

* clip base class & clip_zs sub-class

* some modifications of details after review

* convert into and use mmpretrain-vit

* modify names of some files and directories

* ram init commit

* [Fix] Fix pipeline bug in image retrieval inferencer

* [CodeCamp2023-341] Add multimodal dataset documentation - COCO Retrieval

* Update OFA to compat with latest huggingface.

* Update train.py to compat with new config

* Bump version to v1.1.0

* Update __init__.py

---------

Co-authored-by: LALBJ <[email protected]>
Co-authored-by: DE009 <[email protected]>
Co-authored-by: mzr1996 <[email protected]>
Co-authored-by: 飞飞 <[email protected]>
@npurson commented Sep 1, 2024

I appreciate your implementation! It really helped me out a lot!

BTW, have you ever considered adding the usage instructions to the README within the clip configs? I struggled to find how to use this feature until I dug through the PRs.
