Skip to content

Latest commit

 

History

History
125 lines (92 loc) · 3.58 KB

config.md

File metadata and controls

125 lines (92 loc) · 3.58 KB

Configuration Definitions

We use dataclassbased definition of configuration. More like a taxonomy which can be locally modified and extended easily.

A typical configuration (for one or more experiments) is based on the following definitions:

from dataclasses import dataclass

@dataclass
class Store: # a particular storage instance
    url: str = '.'     # access path
    stype: str = 'file' # storage type: file, nfs, s3, ...
      
@dataclass
class Build: 
    image_url: str
    build_steps: List[str]
    Dockerfile_path: str = None
    image_pull_policy: str = "IfNotPresent"

@dataclass
class Container:
    build: Build
    dirs: ContainerDirs = ContainerDirs()
      
@dataclass
class StorageDirs:
    working_dir: Store       #storage:input -- code lies here
    data_dir: Store           #storage:input -- data lies here
    output_dir: Store       #storage:output -- output artifacts go here

@dataclass
class Run: #options specific to a given run, with some defaults
    cmd: str 										# python train.py --lr {{lr}}
    experiment_name: str 				# 'ex'
    run_name: str = ''					# 'ex-8'
    max_cpu: str = "500m"       # half a core
    max_memory: str ="2Gi"      # 2 GB Memory
    ...													# more fields (see lightex/base_config.py)

@dataclass(frozen=True)
class Resources:
    storage: StorageDirs  # storage paths / urls
    loggers: LoggerConfig # config for multiple loggers
    ctr: Container = None # None if no container used

      
@dataclass
class HP: #all hyperparameters go here, can be nested
    batch_size: int = 4
    epochs: int = 10
    lr: float = 1e-3
    hidden_size: int = 512
    num_layers: int = 8
    # .. more

@dataclass 
class Experiment: #an experiment needs resources, specific hps, and run cmd options
    er: Resources
    hp: HP
    run: Run

The final config definition defines one or more experiments (here only one):

@dataclass
class Config:
    er: Resources
    hp: HP
    run: Run

    def get_experiments(self): #required: generate a list of experiments to run
        expts = [Experiment(er=self.er, hp=self.hp, run=self.run)]
        return expts

Note: all dataclasses are defined to be frozen : @dataclass(frozen=True)

Configuration Instances

Now, create one or more instances of the Config class.

B1 = Build(image_url='efnet:latest', 
            			build_steps=['docker build -t efnet:latest .', 
                    'docker push localhost:32000/efnet:latest'])
Co1 = Container(build=B1)
S1 = StorageDirs(data_dir='/data/imagenette-160/') #defaults: working_dir and output_dir
L = LoggerConfig(names=['mlflow', 'trains'])
R1 = Resources(storage=S1, ctr=Co1, loggers=L) #or ctr=None

H1 = HP(batch_size=16, epochs=10, model_name='efficientnet-b1')

Ru1 = Run(
    cmd="python main.py --data-dir {{run.data_dir}} \
        -a {{hp.model_name}} --pretrained  -b {{hp.batch_size}} \
        -j 0  --epochs {{hp.epochs}} --gpu {{hp.gpu_id}}", 
    experiment_name="efnet0"
    )

C1 = Config(er=R1, hp=H1, run=Ru1)

When dispatching experiments, we refer to config instances by name, e.g., C1.

Now, we can quickly setup another related experiment configuration instance C2.

from dataclasses import replace

H2 = replace(H1, model_name='efficientnet-b2') # batch_size and epochs remain same
C2 = Config(er=R1, hp=H2, run=Ru1)

More examples under this directory.

For now, LightEx assumes the dataclass structure of Experiment and Config but allow flexibility with fields of classes, e.g., HP. Conversion to yaml andjson from a dataclass object is supported.