Skip to content

Latest commit

 

History

History
41 lines (32 loc) · 1.14 KB

README.md

File metadata and controls

41 lines (32 loc) · 1.14 KB

Bottleneck Transformers for Visual Recognition

Update 2021/03/14

  • support Multi-head Attention

Experiments

Model heads Params (M) Acc (%)
ResNet50 baseline (ref) 23.5M 93.62
BoTNet-50 1 18.8M 95.11%
BoTNet-50 4 18.8M 95.78%
BoTNet-S1-50 1 18.8M 95.67%
BoTNet-S1-59 1 27.5M 95.98%
BoTNet-S1-77 1 44.9M wip

Summary

스크린샷 2021-01-28 오후 4 50 19

Usage (example)

  • Model
from model import Model

model = ResNet50(num_classes=1000, resolution=(224, 224))
x = torch.randn([2, 3, 224, 224])
print(model(x).size())
  • Module
from model import MHSA

resolution = 14
mhsa = MHSA(planes, width=resolution, height=resolution)

Reference

  • Paper link
  • Author: Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
  • Organization: UC Berkeley, Google Research