Before working on the GaNDLF-Synth code, please follow the instructions to install GaNDLF-Synth from source. Once that's done, verify the installation using the following command:
# continue from previous shell
(venv_gandlf) $>
# you should be in the "GaNDLF-Synth" git repo
(venv_gandlf) $> gandlf-synth verify-install
- The following flowcharts are intended to provide a high-level overview of the different submodules in GaNDLF-Synth.
- Navigate to the README.md file in each submodule folder for details.
- Some flowcharts are still in development and may not be complete/present.
- Command-line parsing: gandlf run
- Config Manager:
  - Handles configuration parsing
  - Provides configuration to other modules
- Training Manager:
  - Main entry point from CLI
  - Handles training functionality
- Inference Manager:
  - Handles inference functionality
  - Main entry point from CLI
  - Performs actual inference
To update/change/add a dependency in setup, please ensure at least the following conditions are met:
- The package is being actively maintained.
- The new dependency is tested against the minimum Python version supported by GaNDLF-Synth (see the python_requires variable in setup).
- It does not clash with any existing dependencies.
- Create the model in a new file in the architectures folder.
- Make sure the new class inherits from the ModelBase class.
- Create a new LightningModule that will implement the training, validation, testing, and inference logic. Make sure it inherits from the [SynthesisModule](https://github.com/mlcommons/GaNDLF-Synth/blob/main/gandlf_synth/models/module_abc.py) class and implements the necessary abstract methods.
- Add the new model to the ModuleFactory AVAILABE_MODULES dictionary. Note that the key in this dictionary should follow the naming convention labeling-paradigm_model-name, and needs to match the key in the model config factory that we will create in the next steps.
- Create the model config file in the configs folder.
- Implement the model configuration class that will parse the model's configuration when creating a model instance. Make sure it inherits from the AbstractModelConfig class.
- Add the new model configuration class to the ModelConfigFactory AVAILABLE_MODEL_CONFIGS dictionary. Note that the model config key in this dictionary should follow the naming convention labeling-paradigm_model-name, and needs to match the key in the model factory that we created in the previous steps.
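The registration steps above can be sketched as follows. This is a minimal, self-contained illustration, not the project's actual code: ModelBase, SynthesisModule, and AbstractModelConfig are replaced here by local stand-in classes, and MyGenerator, UnlabeledMyGanModule, UnlabeledMyGanConfig, the key unlabeled_mygan, and the helper unmatched_keys are all hypothetical names. In the real code base the module would also be a LightningModule, and the two dictionaries belong to ModuleFactory and ModelConfigFactory.

```python
from abc import ABC, abstractmethod


# Local stand-ins for the GaNDLF-Synth base classes (assumed shapes, not the real API).
class ModelBase(ABC):
    """Placeholder for the architecture base class."""


class SynthesisModule(ABC):
    """Placeholder for the abstract module that holds the step logic."""

    def __init__(self, model_config):
        self.model_config = model_config

    @abstractmethod
    def training_step(self, batch, batch_idx):
        ...


class AbstractModelConfig(ABC):
    """Placeholder for the model configuration base class."""

    def __init__(self, raw_config):
        self.raw_config = raw_config


# Hypothetical architecture, defined in a new file in the architectures folder.
class MyGenerator(ModelBase):
    pass


# Hypothetical module implementing the required abstract methods.
class UnlabeledMyGanModule(SynthesisModule):
    def training_step(self, batch, batch_idx):
        return {"loss": 0.0}  # placeholder; the real training logic goes here


# Hypothetical configuration class, defined in the configs folder.
class UnlabeledMyGanConfig(AbstractModelConfig):
    pass


# The same key, following labeling-paradigm_model-name, is used in both registries.
MODEL_KEY = "unlabeled_mygan"
AVAILABE_MODULES = {MODEL_KEY: UnlabeledMyGanModule}
AVAILABLE_MODEL_CONFIGS = {MODEL_KEY: UnlabeledMyGanConfig}


def unmatched_keys(modules, configs):
    """Return the keys that appear in only one of the two registries."""
    return sorted(set(modules) ^ set(configs))
```

A check such as unmatched_keys(AVAILABE_MODULES, AVAILABLE_MODEL_CONFIGS) returning an empty list is one quick way to confirm that the two keys match after adding a model.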
GaNDLF-Synth handles dataloading for specific labeling paradigms in separate abstractions. If you wish to modify the dataloading functionality, please refer to the following modules:
- Datasets: Contains the dataset classes for different labeling paradigms.
- Dataset Factories: Contains the factory class that creates the dataset instance based on the configuration.
- Dataloader Factories: Contains the factory class that creates the dataloader instance based on the configuration.
Remember to add new datasets and dataloaders to the respective factory classes. In some cases, the training or inference logic may also need to be modified to accommodate the new dataloading functionality (see below).
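As a rough illustration of how a dataset and its factory fit together, the sketch below uses a map-style dataset (one that defines __len__ and __getitem__) and a factory keyed by labeling paradigm. The names UnlabeledFolderDataset, DatasetFactory, AVAILABLE_DATASETS, and get_dataset are hypothetical and were chosen for this example; the actual GaNDLF-Synth classes and signatures may differ.

```python
from pathlib import Path


class UnlabeledFolderDataset:
    """Hypothetical map-style dataset: one sample per file, no labels."""

    def __init__(self, file_paths):
        self.file_paths = [Path(p) for p in file_paths]

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, index):
        # A real dataset would load and transform the image here.
        return {"path": self.file_paths[index]}


class DatasetFactory:
    """Hypothetical factory that picks a dataset class by labeling paradigm."""

    # New dataset classes are registered here.
    AVAILABLE_DATASETS = {"unlabeled": UnlabeledFolderDataset}

    def get_dataset(self, labeling_paradigm, file_paths):
        try:
            dataset_cls = self.AVAILABLE_DATASETS[labeling_paradigm]
        except KeyError:
            raise ValueError(
                f"Unknown labeling paradigm {labeling_paradigm!r}; "
                f"available: {sorted(self.AVAILABLE_DATASETS)}"
            ) from None
        return dataset_cls(file_paths)
```

Raising an error that lists the registered keys makes a forgotten registration easy to spot when a new dataset is added.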
- For changes at the level of single training, validation, or test steps, modify the specific functions of a given module.
- For changes at the level of the entire training loop, modify the Training Manager. The main training loop is handled via the Trainer class of PyTorch Lightning; please refer to the PyTorch Lightning documentation for more details.
- For changes at the level of single inference steps, modify the specific functions of a given module. Note that for inference, special dataloaders are used to load the data in the required format.
- For changes at the level of the entire inference loop, modify the Inference Manager. The main inference loop is handled via the Trainer class of PyTorch Lightning; please refer to the PyTorch Lightning documentation for more details.
Example: gandlf-synth run CLI command
- Implement a function and wrap it with @click.command() + @click.option().
- Add it to the cli_subommands dict. The command would then be available under the gandlf-synth your-subcommand-name CLI command.
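The two steps above might look roughly like this. It is a sketch under stated assumptions: the hello command and its --name option are hypothetical, the dict below is a local stand-in for the project's subcommand dict, and the cli group is a stand-in for the top-level gandlf-synth entry point. How GaNDLF-Synth actually attaches the dict entries to its entry point may differ.

```python
import click


@click.command()
@click.option("--name", "-n", required=True, type=str, help="Name to greet.")
def hello(name):
    """Hypothetical subcommand that prints a greeting."""
    click.echo(f"Hello, {name}!")


# Local stand-in for the project's subcommand dict; the key becomes the subcommand name.
cli_subommands = {"hello": hello}


@click.group()
def cli():
    """Stand-in for the top-level gandlf-synth entry point."""


# Attach every registered subcommand to the top-level group.
for subcommand_name, subcommand in cli_subommands.items():
    cli.add_command(subcommand, name=subcommand_name)
```

With this wiring, the command is reached as a subcommand of the group, using the dict key as its name.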
For any new feature that is configurable via the config, please ensure the corresponding option is documented in the "extending" section of this documentation, so that others can review, use, and extend it as needed.
Once you have made changes to functionality, it is imperative that the unit tests be updated to cover the new code. Please see the full testing suite for details and examples. Note that tests are split into different categories, each having its own file in the aforementioned folder:
- test_modules.py: module-specific tests
- test_generic.py: global features tests
- entrypoints/: tests for specific CLI commands
The tests use sample data, which is downloaded and prepared automatically the first time the unit tests are run; the prepared data is stored at GaNDLF-Synth/testing/data/. However, you may want to download and explore the data yourself.
Once you have the virtual environment set up, tests can be run using the following command:
# continue from previous shell
(venv_gandlf) $> pytest --device cuda # can be cuda or cpu, defaults to cpu
Any failures will be reported in the file GaNDLF-Synth/testing/failures.log.
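The --device flag in the command above is not built into pytest; flags like it are usually declared through pytest's pytest_addoption hook in a conftest.py file. The sketch below shows that general pattern, assuming GaNDLF-Synth wires the flag in a similar way. The fixture name device and the help text are assumptions, so check the project's testing folder for the actual definition.

```python
import pytest


def pytest_addoption(parser):
    # Register a custom --device flag; "cpu" is used when the flag is omitted.
    parser.addoption(
        "--device",
        action="store",
        default="cpu",
        help="Device to run the tests on: cpu or cuda.",
    )


@pytest.fixture
def device(request):
    # Expose the flag's value to any test that declares a `device` argument.
    return request.config.getoption("--device")
```

A new test can then accept device as an argument and pass it on to the code under test, so the same test runs on either CPU or GPU.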
The code coverage for the unit tests can be obtained by the following command:
# continue from previous shell
(venv_gandlf) $> coverage run -m pytest --device cuda; coverage report -m