
Efficient ImageNet Classification

🚀 Training ResNet-50 on ImageNet in 8 hours.

This repo provides an efficient implementation of ImageNet classification, based on PyTorch, DALI, and Apex.

If you have any questions, please open an issue or contact me at [email protected]

Features

  • Accelerated pre-processing of input data with DALI
  • Half/mixed-precision training with Apex
  • Real-time logger
  • Extremely simple structure

Getting Started

Installation

1. Clone the repo

git clone https://github.com/13952522076/Efficient_ImageNet_Classification.git
cd Efficient_ImageNet_Classification

2. Requirements

  • Python 3.6
  • PyTorch 1.3+
  • CUDA 10+
  • GCC 5.0+

pip install -r requirements.txt

3. Install DALI and Apex

DALI Installation:

cd ~
# For CUDA10
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100
# or
# For CUDA11
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda110

For more details, please see the NVIDIA DALI installation guide.
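
As a rough illustration of what DALI does here, below is a minimal training-pipeline sketch using DALI's functional API (available in newer DALI releases; the repo's own pipeline may differ). The path /path/to/imagenet/train and all parameter values are placeholders.

from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali.plugin.pytorch import DALIClassificationIterator

@pipeline_def
def train_pipe(data_dir):
    # Read and shuffle JPEGs, then decode on the GPU ("mixed" backend)
    jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True, name="Reader")
    images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
    # Standard ImageNet augmentation: random resized crop + flip + normalize
    images = fn.random_resized_crop(images, size=224)
    images = fn.crop_mirror_normalize(
        images,
        dtype=types.FLOAT,
        output_layout="CHW",
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
        mirror=fn.random.coin_flip(),
    )
    return images, labels

pipe = train_pipe(batch_size=32, num_threads=4, device_id=0,
                  data_dir="/path/to/imagenet/train")  # placeholder path
pipe.build()
loader = DALIClassificationIterator(pipe, reader_name="Reader")

Because decoding and augmentation run on the GPU, the CPU-side data loader stops being the training bottleneck.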

Apex Installation:

cd ~
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

For more details, please see the Apex repository or the Apex full API documentation.
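
In the training scripts, mixed precision boils down to Apex's amp API. Here is a minimal sketch; the model and optimizer are placeholders, and the opt level mirrors the --fp16 / --opt-level flags used in the training commands below.

import torch
from apex import amp

model = torch.nn.Linear(512, 10).cuda()              # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# O1 = patch-based mixed precision; O0 = pure FP32 (compare --opt-level below)
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

inputs = torch.randn(4, 512).cuda()
loss = model(inputs).sum()

# Scale the loss to avoid FP16 gradient underflow, then step as usual
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()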

Training & Testing

We provide two training strategies: a step LR scheduler and a cosine LR scheduler, implemented in main_step.py and main_cosine.py, respectively.

The trained models (the last and the best checkpoints) and the log file are saved to "checkpoints/imagenet/model_name" by default.


I suggest manually setting the path to the ImageNet dataset in main_step.py (line 49) and main_cosine.py (line 50): replace the default value with your actual path.

Alternatively, you can pass the path with the --data argument in the training commands below.
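
A hypothetical argparse definition mirroring that flag (the default path below is a placeholder; the real definition lives in the scripts themselves):

import argparse

parser = argparse.ArgumentParser(description='ImageNet training')
# Hypothetical mirror of the repo's --data flag; the real default is at
# main_step.py line 49 / main_cosine.py line 50
parser.add_argument('--data', metavar='DIR', default='/path/to/imagenet',
                    help='path to the ImageNet root (with train/ and val/)')
args = parser.parse_args()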

For the step learning-rate scheduler, run the following command:

# change the parameters accordingly if necessary
# e.g., if you have 4 GPUs, set nproc_per_node to 4. To train in full FP32, remove --fp16.
python3 -m torch.distributed.launch --nproc_per_node=8 main_step.py -a old_resnet50 --fp16 --b 32

For the cosine learning-rate scheduler, run the following command:

# change the parameters accordingly if necessary
python3 -m torch.distributed.launch --nproc_per_node=8 main_cosine.py -a old_resnet18 --b 64 --opt-level O0

Add New Models

Please follow the coding style of models/resnet.py.

  1. Add a new model file in folder models
  2. Import the model file in the models package, i.e., in models/__init__.py (see the sketch below)
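
For instance, a hypothetical models/mynet.py (the file name, class, and constructor below are illustrative, not part of the repo) could look like:

# models/mynet.py -- hypothetical example file
import torch.nn as nn

class MyNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(MyNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        return self.fc(x)

def mynet(**kwargs):
    # Constructor function, so the -a flag can resolve the model by name
    return MyNet(**kwargs)

Then register it in models/__init__.py (e.g., from .mynet import *) so the training scripts can find it by its architecture name.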

Calculate Parameters and FLOPs

python3 count_Param.py

🐛 Known limitation: it does not count operations that only appear in the forward pass. For example, defining a pooling layer in the __init__ function versus calling the functional pooling operation in forward will lead to different counts, even though both compute the same thing.
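
A toy illustration of the pitfall (hypothetical module names; the two variants behave identically at runtime, but a counter that walks registered modules only sees the first):

import torch.nn as nn
import torch.nn.functional as F

class PoolInInit(nn.Module):
    def __init__(self):
        super(PoolInInit, self).__init__()
        self.pool = nn.AvgPool2d(2)   # visible to a counter that walks modules

    def forward(self, x):
        return self.pool(x)

class PoolInForward(nn.Module):
    def forward(self, x):
        return F.avg_pool2d(x, 2)     # invisible: only exists at forward time

# Both modules perform identical pooling, yet a hook/module-based FLOPs
# counter reports different numbers for them.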

Acknowledgements

This implementation is built upon the PyTorch ImageNet demo and PytorchInsight.

Many thanks to Xiang Li for his great work.
