Take a quick look at the official paper (docs), where I have highlighted the key points.
- EfficientNet uses the compound scaling method to scale depth, width, and resolution.
- Compound scaling makes sense because if the input image is bigger, the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
- The compound scaling method uses a compound coefficient φ to uniformly scale network width, depth, and resolution in a principled way.
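
To make the role of φ concrete, here is a minimal sketch of compound scaling. It assumes the base coefficients reported in the EfficientNet paper (α = 1.2 for depth, β = 1.1 for width, γ = 1.15 for resolution, chosen so that α·β²·γ² ≈ 2) and a 224-pixel baseline input. The function names `compound_scale` and `flops_multiplier` are hypothetical helpers for illustration and are not part of any library. The published B1-B7 configurations use hand-rounded values, so this sketch will not reproduce them exactly.

```python
# Base coefficients as reported in the EfficientNet paper (assumption:
# these are the grid-searched values for the B0 baseline).
ALPHA = 1.2   # depth multiplier base
BETA = 1.1    # width multiplier base
GAMMA = 1.15  # resolution multiplier base


def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Scale depth, width and resolution together with one coefficient phi.

    depth      = base_depth      * ALPHA ** phi
    width      = base_width      * BETA  ** phi
    resolution = base_resolution * GAMMA ** phi (rounded to whole pixels)
    """
    depth = base_depth * ALPHA ** phi
    width = base_width * BETA ** phi
    resolution = int(round(base_resolution * GAMMA ** phi))
    return depth, width, resolution


def flops_multiplier(phi):
    """Approximate FLOPS growth: FLOPS scale with depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


if __name__ == "__main__":
    for phi in range(4):
        d, w, r = compound_scale(phi)
        print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
              f"resolution {r}px, FLOPS x{flops_multiplier(phi):.2f}")
```

Because α·β²·γ² is close to 2, each increment of φ roughly doubles the compute budget, which is what lets a single number control all three dimensions at once.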
The goal is to classify species in camera trap images collected by the Wild Chimpanzee Foundation and the Max Planck Institute for Evolutionary Anthropology. The images include birds, civets, duikers, hogs, leopards, monkeys, rodents, and empty images. The task is to build a model that helps researchers identify the species in these images.
These are some images from the challenge. For a more in-depth understanding, see the problem link.
- EfficientNet PyTorch implementation - GitHub repo