Renamed the VGG-style architectures
- Renamed VGG5, VGG7, and VGG9 to Conv-2, Conv-4, and Conv-6
  respectively, because these are the names used in the original paper
- Renamed the VGG17 architecture to VGG19, because this is the name used
  in the original paper; although it only has 17 weight layers instead
  of 19 like the real VGG19, it is still an adapted VGG19 with the same
  number of convolutional layers
lecode-official committed Oct 20, 2022
1 parent c82101b commit 316c9bd
Showing 7 changed files with 59 additions and 57 deletions.
1 change: 1 addition & 0 deletions .vscode/settings.json
@@ -28,6 +28,7 @@
"cifar",
"CIFAR",
"conda",
"conv",
"Conv",
"convolutional",
"cuda",
10 changes: 5 additions & 5 deletions CHANGELOG.md
@@ -2,17 +2,17 @@

## v0.1.0

- *Released on October 19, 2022*
+ *Released on October 20, 2022*

- Initial release
- Implements the original lottery ticket hypothesis algorithm using magnitude pruning
- Supports the following models:
- LeNet-300-100
- LeNet-5
- - VGG5
- - VGG7
- - VGG9
- - VGG17
+ - Conv-2
+ - Conv-4
+ - Conv-6
+ - VGG19
- Supports the following datasets:
- MNIST
- CIFAR-10
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -24,7 +24,7 @@ keywords:
- Lottery Ticket Hypothesis
- Pruning
version: 0.1.0
- date-released: 2022-10-19
+ date-released: 2022-10-20
license: MIT
repository: https://github.com/lecode-official/pytorch-lottery-ticket-hypothesis.git
url: https://github.com/lecode-official/pytorch-lottery-ticket-hypothesis
30 changes: 14 additions & 16 deletions README.md
@@ -69,10 +69,10 @@ Currently the following models and datasets are supported:

- LeNet-300-100 [[4]](#4) (`lenet-300-100`)
- LeNet-5 [[4]](#4) (`lenet-5`)
- - VGG5 [[7]](#7) (`vgg5`)
- - VGG7 [[7]](#7) (`vgg7`)
- - VGG9 [[7]](#7) (`vgg9`)
- - VGG17 [[7]](#7) (`vgg17`)
+ - Conv-2 [[1]](#1) (`conv-2`)
+ - Conv-4 [[1]](#1) (`conv-4`)
+ - Conv-6 [[1]](#1) (`conv-6`)
+ - VGG19 [[1](#1), [7](#7)] (`vgg19`)

**Datasets:**

@@ -151,15 +151,13 @@ If you use this software in your research, please cite it like this or use the "

## To-Do's

- 1. The names of the VGG networks seems to be wrong, they should be renamed
- 2. General clean up, so that the project can be made public
- 3. Intelligently retain model checkpoint files
- 4. Extensively log hyperparameters and training statistics
- 5. Add support for plotting training statistics
- 6. Make it possible to gracefully abort the training process
- 7. Add support for macOS on ARM64
- 8. Implement the ResNet-18 model
- 9. Perform extensive experiments on all supported models and datasets and record the results in the read me
- 10. Make it possible to redo all of the experiments from the original paper
- 11. Implement the models that were used in the paper
- 12. Add support for different mask-0 and mask-1 actions
+ 1. Intelligently retain model checkpoint files
+ 2. Extensively log hyperparameters and training statistics
+ 3. Add support for plotting training statistics
+ 4. Make it possible to gracefully abort the training process
+ 5. Add support for macOS on ARM64
+ 6. Implement the ResNet-18 model
+ 7. Perform extensive experiments on all supported models and datasets and record the results in the read me
+ 8. Make it possible to redo all of the experiments from the original paper
+ 9. Add support for different mask-0 and mask-1 actions
+ 10. Make Dropout optional
8 changes: 3 additions & 5 deletions source/lth/models/__init__.py
@@ -8,8 +8,6 @@

import torch

- from lth.datasets import BaseDataset


def model_id(new_id: str) -> Callable[[type], type]:
"""A decorator, which adds a model ID to a model class.
@@ -220,7 +218,7 @@ def get_model_classes() -> list[type]:
module_name = os.path.splitext(os.path.basename(module_path))[0]
model_modules.append(__import__(f'lth.models.{module_name}', fromlist=['']))

- # Gets the model classes, which are all the classes in the models module and its sub-modules that inherit from BaseDataset
+ # Gets the model classes, which are all the classes in the models module and its sub-modules that inherit from BaseModel
model_classes = []
for module in model_modules:
for _, module_class in inspect.getmembers(module, inspect.isclass):
@@ -246,7 +244,7 @@ def get_model_ids() -> list[str]:
return model_ids


- def create_model(id_of_model: str, input_size: tuple, number_of_input_channels: int, number_of_classes: int) -> BaseDataset:
+ def create_model(id_of_model: str, input_size: tuple, number_of_input_channels: int, number_of_classes: int) -> BaseModel:
"""Creates the model with the specified name.
Args:
@@ -260,7 +258,7 @@ def create_model(id_of_model: str, input_size: tuple, number_of_input_channels:
ValueError: When the model with the specified name could not be found, an exception is raised.
Returns:
- BaseDataset: Returns the model with the specified name.
+ BaseModel: Returns the model with the specified name.
"""

# Finds the class for the specified model, all models in this module must have a class-level variable containing a model identifier
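To illustrate what the rename means for callers of this factory, a hedged usage sketch with one of the new model IDs; the signature is taken from the diff above, the argument values mirror the CIFAR-10 defaults shown in the model constructors, and the import path is an assumption:

```python
from lth.models import create_model

# Creates the Conv-2 model for CIFAR-10 (formerly requested as 'vgg5'):
# 32x32 images, 3 input channels (RGB), 10 output classes
model = create_model('conv-2', input_size=(32, 32), number_of_input_channels=3, number_of_classes=10)
```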
8 changes: 4 additions & 4 deletions source/lth/models/hyperparameters.py
@@ -26,13 +26,13 @@ def get_defaults(model_name: str, dataset_name: str, learning_rate: float, batch
default_learning_rate, default_batch_size, default_number_of_epochs = 1.2e-3, 60, 50
elif model_name == 'lenet-5' and dataset_name == 'mnist':
default_learning_rate, default_batch_size, default_number_of_epochs = 1.2e-3, 60, 50
- elif model_name == 'vgg5' and dataset_name == 'cifar10':
+ elif model_name == 'conv-2' and dataset_name == 'cifar10':
default_learning_rate, default_batch_size, default_number_of_epochs = 2e-4, 60, 20
- elif model_name == 'vgg7' and dataset_name == 'cifar10':
+ elif model_name == 'conv-4' and dataset_name == 'cifar10':
default_learning_rate, default_batch_size, default_number_of_epochs = 3e-4, 60, 25
- elif model_name == 'vgg9' and dataset_name == 'cifar10':
+ elif model_name == 'conv-6' and dataset_name == 'cifar10':
default_learning_rate, default_batch_size, default_number_of_epochs = 3e-4, 60, 30
- elif model_name == 'vgg17' and dataset_name == 'cifar10':
+ elif model_name == 'vgg19' and dataset_name == 'cifar10':
default_learning_rate, default_batch_size, default_number_of_epochs = 3e-4, 64, 112

learning_rate = learning_rate if learning_rate is not None else default_learning_rate
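For quick reference, the CIFAR-10 defaults under the new model IDs, transcribed from the `get_defaults` diff above (a summary sketch, not code from the repository):

```python
# (learning_rate, batch_size, number_of_epochs) for each renamed model on CIFAR-10
CIFAR10_DEFAULTS = {
    'conv-2': (2e-4, 60, 20),
    'conv-4': (3e-4, 60, 25),
    'conv-6': (3e-4, 60, 30),
    'vgg19': (3e-4, 64, 112),
}
```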
57 changes: 31 additions & 26 deletions source/lth/models/vgg.py
@@ -1,6 +1,7 @@
"""Represents a module that contains the multiple neural network models based on the VGG family of architectures first introduced by K. Simonyan and
A. Zisserman in their paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". VGG was named after Oxford's renowned Visual
- Geometry Group (VGG).
+ Geometry Group (VGG). The three architectures, referred to as Conv-2, Conv-4, and Conv-6, are scaled-down versions for use with CIFAR-10 and were
+ introduced by Frankle et al. in their paper "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks".
"""

import torch
@@ -9,14 +9,15 @@
from . import BaseModel


- @model_id('vgg5')
- class Vgg5(BaseModel):
- """Represents a very small VGG-variant with only 5 weight layers. In the original paper by Frankle et al., this is referred to as Conv-2 as it has
- 2 convolutional layers.
+ @model_id('conv-2')
+ class Conv2(BaseModel):
+ """Represents a VGG-variant scaled down for CIFAR-10 with only 2 convolutional and 3 fully-connected layers, which was introduced by Frankle et
+ al. in their paper "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks". They refer to this architecture as Conv-2, because
+ it has 2 convolutional layers.
"""

def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int = 3, number_of_classes: int = 10) -> None:
"""Initializes a new Vgg2 instance.
"""Initializes a new Conv2 instance.
Args:
input_size (tuple, optional): A tuple containing the edge lengths of the input images, which is the input size of the first convolution of
@@ -31,7 +33,7 @@ def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int =
super().__init__()

# Exposes some information about the model architecture
- self.name = 'VGG5'
+ self.name = 'Conv-2'
self.pruning_rates = {
'convolution_1': 0.1,
'convolution_2': 0.1,
@@ -98,14 +100,15 @@ def forward(self, x: torch.Tensor) -> torch.Tensor:
return x


- @model_id('vgg7')
- class Vgg7(BaseModel):
- """Represents a small VGG-variant with only 7 weight layers. In the original paper by Frankle et al., this is referred to as Conv-4, as it has 4
- convolutional layers.
+ @model_id('conv-4')
+ class Conv4(BaseModel):
+ """Represents a VGG-variant scaled down for CIFAR-10 with only 4 convolutional and 3 fully-connected layers, which was introduced by Frankle et
+ al. in their paper "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks". They refer to this architecture as Conv-4, because
+ it has 4 convolutional layers.
"""

def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int = 3, number_of_classes: int = 10) -> None:
"""Initializes a new Vgg4 instance.
"""Initializes a new Conv4 instance.
Args:
input_size (tuple, optional): A tuple containing the edge lengths of the input images, which is the input size of the first convolution of
@@ -120,7 +123,7 @@ def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int =
super().__init__()

# Exposes some information about the model architecture
- self.name = 'VGG7'
+ self.name = 'Conv-4'
self.pruning_rates = {
'convolution_1': 0.1,
'convolution_2': 0.1,
@@ -213,14 +216,15 @@ def forward(self, x: torch.Tensor) -> torch.Tensor:
return x


- @model_id('vgg9')
- class Vgg9(BaseModel):
- """Represents a small VGG-variant with only 9 weight layers. In the original paper by Frankle et al., this is referred to as Conv-6, as it has 6
- convolutional layers.
+ @model_id('conv-6')
+ class Conv6(BaseModel):
+ """Represents a VGG-variant scaled down for CIFAR-10 with only 6 convolutional and 3 fully-connected layers, which was introduced by Frankle et
+ al. in their paper "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks". They refer to this architecture as Conv-6, because
+ it has 6 convolutional layers.
"""

def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int = 3, number_of_classes: int = 10) -> None:
"""Initializes a new Vgg6 instance.
"""Initializes a new Conv6 instance.
Args:
input_size (tuple, optional): A tuple containing the edge lengths of the input images, which is the input size of the first convolution of
@@ -235,7 +239,7 @@ def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int =
super().__init__()

# Exposes some information about the model architecture
- self.name = 'VGG9'
+ self.name = 'Conv-6'
self.pruning_rates = {
'convolution_1': 0.15,
'convolution_2': 0.15,
@@ -354,12 +358,13 @@ def forward(self, x: torch.Tensor) -> torch.Tensor:
return x


- @model_id('vgg17')
- class Vgg17(BaseModel):
- """Represents a VGG-variant with 17 weight layers. In the original paper by Frankle et al. this is referred to as VGG19, because it is exactly as
- VGG19 with the difference, that this version was adapted to CIFAR-10 and is therefore missing 2 fully-connected layers at the end, but it has 16
- convolutional layers just as VGG19. Another difference to the original VGG19 is that after the last convolutional layer, an average pooling is
- performed instead of max pooling. This is the same as in the original paper by Frankle et al.
+ @model_id('vgg19')
+ class Vgg19(BaseModel):
+ """Represents a VGG-variant, which was introduced by Frankle et al. in their paper "The Lottery Ticket Hypothesis: Finding Sparse, Trainable
+ Neural Networks". They refer to this architecture as VGG19, although it is not the same VGG19 architecture first introduced by K. Simonyan and A.
+ Zisserman in their paper "Very Deep Convolutional Networks for Large-Scale Image Recognition". This version was adapted to CIFAR-10 and is
+ therefore missing 2 fully-connected layers at the end, but it has the same 16 convolutional layers as VGG19. Another difference from the
+ original VGG19 is that after the last convolutional layer, average pooling is performed instead of max pooling.
"""

def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int = 3, number_of_classes: int = 10) -> None:
@@ -378,7 +383,7 @@ def __init__(self, input_size: tuple = (32, 32), number_of_input_channels: int =
super().__init__()

# Exposes some information about the model architecture
- self.name = 'VGG17'
+ self.name = 'VGG19'
self.pruning_rates = {
'convolution_1': 0.2,
'convolution_2': 0.2,
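In summary, the new names map onto the architectures as follows (layer counts taken from the docstrings above):

| Model ID | Convolutional layers | Fully-connected layers | Weight layers |
|----------|----------------------|------------------------|---------------|
| `conv-2` | 2 | 3 | 5 |
| `conv-4` | 4 | 3 | 7 |
| `conv-6` | 6 | 3 | 9 |
| `vgg19` | 16 | 1 | 17 |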
