This demo (based on @greyovo's Jupyter notebook) shows the inference results for the same text and image input across different models, including the original CLIP model and the ONNX-quantized encoders. The test results on my local machine are as follows:
Model | Result (softmax probabilities) |
---|---|
CLIP | [[6.1091479e-02 9.3267566e-01 5.3717378e-03 8.6108845e-04]] |
clip-image-encoder.onnx & clip-text-encoder.onnx | [[6.1091259e-02 9.3267584e-01 5.3716768e-03 8.6109847e-04]] |
clip-image-encoder-quant-int8.onnx & clip-text-encoder-quant-int8.onnx | [[4.703762e-02 9.391219e-01 9.90335e-03 3.93698e-03]] |
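
Below is a minimal sketch of how such a comparison can be run. The checkpoint name `openai/clip-vit-base-patch32`, the image path `test.jpg`, and the ONNX input/output tensor names are assumptions; verify them against your own export (e.g. via `session.get_inputs()`).

```python
import numpy as np
import onnxruntime as ort
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

texts = ["a tiger", "a cat", "a dog", "a bear"]
image = Image.open("test.jpg")  # hypothetical path to the test image

# --- Original CLIP (PyTorch) ---
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")  # assumed checkpoint
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
probs_pt = model(**inputs).logits_per_image.softmax(dim=-1).detach().numpy()

# --- ONNX encoders (swap in the *-quant-int8.onnx files to test the quantized models) ---
# The input tensor names below ("pixel_values", "input_ids") are assumptions.
img_sess = ort.InferenceSession("clip-image-encoder.onnx")
txt_sess = ort.InferenceSession("clip-text-encoder.onnx")
img_emb = img_sess.run(None, {"pixel_values": inputs["pixel_values"].numpy()})[0]
txt_emb = txt_sess.run(None, {"input_ids": inputs["input_ids"].numpy().astype(np.int64)})[0]

# Normalize embeddings and compute cosine-similarity logits, as CLIP does.
img_emb = img_emb / np.linalg.norm(img_emb, axis=-1, keepdims=True)
txt_emb = txt_emb / np.linalg.norm(txt_emb, axis=-1, keepdims=True)
logits = 100.0 * img_emb @ txt_emb.T  # 100.0 is CLIP's default logit scale
probs_onnx = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

print("PyTorch CLIP:", probs_pt)
print("ONNX encoders:", probs_onnx)
```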
The test input texts are `["a tiger", "a cat", "a dog", "a bear"]`, and the test image is as follows: