Implementation of Zero-Shot CLIP Classifier #1737
Conversation
from ..clip_zs.clip import CLIP_zs
from ..clip_zs.clip_transformer import CLIPTransformer, CLIPVisionTransformer

__all__ = ['CLIP_zs', 'CLIPTransformer', 'CLIPVisionTransformer']
Please rename the class CLIP_zs to CLIP, and please check the config and the other Python files as well.
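If only the class name changes, the package's __init__.py would then look roughly like this (a sketch; the clip_zs module path is kept as-is here, although later review comments suggest file names may change too):

```python
from ..clip_zs.clip import CLIP
from ..clip_zs.clip_transformer import CLIPTransformer, CLIPVisionTransformer

__all__ = ['CLIP', 'CLIPTransformer', 'CLIPVisionTransformer']
```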
num_workers=8,
dataset=dict(
    type='CIFAR100',
    data_root='/public/DATA/qbw/img_cls_dataset/cifar100',
Suggested change:
- data_root='/public/DATA/qbw/img_cls_dataset/cifar100',
+ data_root='data/cifar100',
We use this repo-relative placeholder path instead of your real local path in the released code; please also check the other configs.
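For illustration, the dataloader section of such a config would then look roughly as follows; the surrounding field names and values are assumptions in the usual mmpretrain style, not a copy of the actual config:

```python
# Sketch only: batch_size, split and pipeline are illustrative assumptions.
test_dataloader = dict(
    batch_size=64,
    num_workers=8,
    dataset=dict(
        type='CIFAR100',
        data_root='data/cifar100',  # repo-relative placeholder, not a machine-specific path
        split='test',
        pipeline=test_pipeline,     # defined elsewhere in the config
    ),
)
```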
@MODELS.register_module()
class CLIP_zs(BaseModel):
    """The implementation of `ChineseCLIP <https://arxiv.org/abs/2211.01335>`_.
Please update the docstring to refer to CLIP and use CLIP's paper link.
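A sketch of the corrected docstring header; the link below is CLIP's actual paper ("Learning Transferable Visual Models From Natural Language Supervision"), while the rest of the docstring is left to the author:

```python
@MODELS.register_module()
class CLIP(BaseModel):
    """The implementation of `CLIP <https://arxiv.org/abs/2103.00020>`_.

    (Keep the existing argument and attribute documentation below this line.)
    """
```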
@MODELS.register_module()
class CLIP_zs(CLIP):
Please rename it to CLIPZeroShot, and check the other parts as well.
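A minimal sketch of the requested rename (the docstring text is illustrative):

```python
@MODELS.register_module()
class CLIPZeroShot(CLIP):
    """Zero-shot classification variant of CLIP."""
```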
Please move all four of these configs into the configs/clip/ folder and rename them to a format like clip_vit-base-p16_zeroshot-cls_cifar100.py, just as chinese_clip does.
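For example, the four files could become clip_vit-base-p16_zeroshot-cls_cifar100.py, clip_vit-base-p16_zeroshot-cls_in1k.py, clip_vit-large-p14_zeroshot-cls_cifar100.py and clip_vit-large-p14_zeroshot-cls_in1k.py under configs/clip/. The first name is the reviewer's; the other three are assumptions based on the two backbones and two evaluation datasets mentioned in this PR.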
* [CodeCamp2023-584] Support DINO self-supervised learning in project (#1756)
  * feat: implement DINO
  * chore: delete debug code
  * chore: implement pre-commit
  * fix: fix imported package
  * chore: pre-commit check
* [CodeCamp2023-340] New Version of config Adapting MobileNet Algorithm (#1774)
  * add new config adapting MobileNetV2, V3
  * add base model config for MobileNetV3; modified all training configs of MobileNetV3 to inherit from the base model config
  * removed directory _base_/models/mobilenet_v3
* [Feature] Implement of Zero-Shot CLIP Classifier (#1737)
  * zero-shot CLIP
  * modify zero-shot CLIP config
  * add in1k_sub_prompt (8 prompts) for improvement
  * add some annotation docs
  * CLIP base class & clip_zs sub-class
  * some modifications of details after review
  * convert into and use mmpretrain-vit
  * modify names of some files and directories
* ram init commit
* [Fix] Fix pipeline bug in image retrieval inferencer
* [CodeCamp2023-341] Supplement multimodal dataset documentation - COCO Retrieval
* Update OFA to be compatible with latest huggingface.
* Update train.py to be compatible with new config
* Bump version to v1.1.0
* Update __init__.py

---------
Co-authored-by: LALBJ <[email protected]>
Co-authored-by: DE009 <[email protected]>
Co-authored-by: mzr1996 <[email protected]>
Co-authored-by: 飞飞 <[email protected]>
I really appreciate your implementation! It helped me out a lot. BTW, have you considered adding usage instructions to the README within the clip configs? I struggled to find how to use this feature until I dug through the PRs.
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry; just open the pull request and seek help from the maintainers.
Motivation
I implemented a version of zero-shot CLIP. Evaluated on CIFAR100 and ImageNet1k, it shows only a small gap from the official (OpenAI) results. After adding the sub-prompt set (8 prompts, one tenth of the full prompt-engineering set used previously), the performance improves further.
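For readers unfamiliar with prompt engineering in zero-shot CLIP: each class name is expanded into several text prompts, and the resulting text embeddings are averaged to form that class's classifier weight. The sketch below shows what an 8-template sub-prompt set might look like; the variable name and template strings are illustrative assumptions, not necessarily the exact prompts added in this PR:

```python
# Illustrative sub-prompt templates in the style of OpenAI's CLIP prompt engineering;
# the actual 8 templates used in this PR may differ.
IN1K_SUB_PROMPTS = [
    'a photo of a {}.',
    'a bad photo of a {}.',
    'a photo of many {}.',
    'a photo of the large {}.',
    'a photo of the small {}.',
    'a cropped photo of a {}.',
    'a black and white photo of a {}.',
    'art of a {}.',
]


def expand_class_names(class_names, templates=IN1K_SUB_PROMPTS):
    """Expand every class name into one text prompt per template."""
    return {name: [t.format(name) for t in templates] for name in class_names}
```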
Modification
I've added the model file under mmpretrain.models.multimodal.clip_zs and the config files under configs.clip_zs. To use it, get the checkpoint and then run the conversion script tools/model_converters/openai-clip_to_mmpretrain-clip.py. I've uploaded the converted weights as clip-vit-b-p16_converted.pth and clip-vit-l-p14_converted.pth.
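For reference, assuming the converter follows the usual mmpretrain convention of `python tools/model_converters/<script>.py SRC_CKPT DST_CKPT`, the conversion would look like `python tools/model_converters/openai-clip_to_mmpretrain-clip.py ViT-B-16.pt clip-vit-b-p16_converted.pth` (check the script's --help for the exact arguments), and zero-shot evaluation then goes through the standard `python tools/test.py CONFIG CONVERTED_CKPT` entry point.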
BC-breaking (Optional)
Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
Use cases (Optional)
The results are shown in the following format:
Checklist
Before PR:
After PR: