Anomalib Pipelines #2005
Conversation
It took me some time, but I managed to do my first round :)
It's looking good so far! I have some minor comments initially and will go for another round later.
src/anomalib/cli/pipelines.py
```python
PIPELINE_REGISTRY: dict[str, Orchestrator] | None = {
    "benchmark": Benchmark(),
}
```
Can this fit into a single line?
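For illustration, the one-liner the comment is asking about (same imports assumed):

```python
PIPELINE_REGISTRY: dict[str, Orchestrator] | None = {"benchmark": Benchmark()}
```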
```python
from anomalib.pipelines.utils import (
    dict_from_namespace,
    hide_output,
)
```
Single line?
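The same imports collapsed to one line, as suggested:

```python
from anomalib.pipelines.utils import dict_from_namespace, hide_output
```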
src/anomalib/pipelines/jobs/base.py
```python
@staticmethod
@abstractmethod
def get_iterator(args: Namespace | None = None) -> Iterator:
```
Can we find a more descriptive name? Does this only return configs each time? If so, wouldn't `iterator` be too generic?
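One possible more descriptive signature, for illustration only; the name `get_config_iterator` is a suggestion, not the merged API:

```python
@staticmethod
@abstractmethod
def get_config_iterator(args: Namespace | None = None) -> Iterator[dict]:
    """Yield one configuration dict per job to be run."""
```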
```python
result.to_csv(file_path, index=False)
self.logger.info(f"Saved results to {file_path}")

def _print_tabular_results(self, gathered_result: pd.DataFrame) -> None:
```
Would this be used anywhere else? If so, maybe we could move this to a util function to keep this class cleaner?
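A minimal sketch of what such a utility could look like, assuming the results arrive as a `pandas.DataFrame` and `rich` is available; the function name and location are hypothetical:

```python
import pandas as pd
from rich.console import Console
from rich.table import Table


def print_tabular_results(gathered_result: pd.DataFrame) -> None:
    """Render a results DataFrame as a rich table on the console."""
    table = Table(title="Results")
    for column in gathered_result.columns:
        table.add_column(str(column))
    for _, row in gathered_result.iterrows():
        table.add_row(*[str(value) for value in row])
    Console().print(table)
```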
```python
log_file = "runs/pipeline.log"
Path(log_file).parent.mkdir(exist_ok=True, parents=True)
logger_file_handler = logging.FileHandler(log_file)
logging.getLogger().addHandler(logger_file_handler)
logging.getLogger().setLevel(logging.DEBUG)
warnings.filterwarnings("ignore")
for logger_name in ["lightning.pytorch", "lightning.fabric", "torchmetrics", "os"]:
    logging.getLogger(logger_name).handlers = [logger_file_handler]
format_string = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
logging.basicConfig(format=format_string, level=logging.DEBUG)
```
Would it be possible to wrap this in a function, like `setup_logging` or something similar? It looks a bit messy this way.
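A minimal sketch of the suggested refactor, wrapping the exact statements from the snippet above; the name `setup_logging` follows the reviewer's suggestion and is not the merged code:

```python
import logging
import warnings
from pathlib import Path


def setup_logging(log_file: str = "runs/pipeline.log") -> None:
    """Redirect pipeline logs to a file and silence noisy third-party loggers."""
    Path(log_file).parent.mkdir(exist_ok=True, parents=True)
    file_handler = logging.FileHandler(log_file)
    logging.getLogger().addHandler(file_handler)
    logging.getLogger().setLevel(logging.DEBUG)
    warnings.filterwarnings("ignore")
    for logger_name in ["lightning.pytorch", "lightning.fabric", "torchmetrics", "os"]:
        logging.getLogger(logger_name).handlers = [file_handler]
    logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.DEBUG)
```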
tools/experimental/README.md
```diff
@@ -0,0 +1,3 @@
+# Anomalib Experimental
+
+These are experimental utilities that are under development. These might change frequently or might even be dropped.
```
Can we add this in a warning section?
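One way to do this in the README, using GitHub's admonition syntax; the exact wording is assumed:

```markdown
> [!WARNING]
> These are experimental utilities that are under development. They might change frequently or might even be dropped.
```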
I glanced over the design real quick; it seems great. I'll do a more thorough overview ASAP, mostly from the viewpoint of the tiled ensemble.
Thanks, I think we're getting there. I like the new Generator design, which returns the job instance. My biggest concern with the current design is that the argument parsing is spread across multiple classes, which makes it a bit hard to follow. This may make it intimidating for users to implement their own custom pipeline. Do you think we could simplify this in some way?
Since we want to encourage users to implement their own custom pipelines, we need to make sure that the functionality of the different components is very clear. I think it would be good to add a bit more detail to the docstrings of the classes (Pipeline, Job, Generator, Runner) including some examples to make it easier for the users to follow.
In line with this, I think it would be good to also add a guide to our documentation explaining step by step how to implement a new pipeline.
```python
class ParallelRunner(Runner):
    """Run the job in parallel using a process pool."""
```
This could be a bit more descriptive, and maybe show some examples.
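A sketch of what a more descriptive docstring with an example could look like; the generator name and constructor signature in the example are assumptions, not the merged API:

```python
class ParallelRunner(Runner):
    """Run a job in parallel using a process pool.

    Each pooled job receives a ``task_id`` that it can map to a device, so the
    pool size is typically set to the number of available devices.

    Example:
        >>> generator = BenchmarkJobGenerator(accelerator="cuda")  # hypothetical
        >>> runner = ParallelRunner(generator, n_jobs=2)  # hypothetical signature
        >>> runner.run(args)
    """
```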
"""Pool execution error should be raised when one or more jobs fail in the pool.""" | ||
|
||
|
||
class ParallelRunner(Runner): |
Is there a way to tell the runner which devices to use? Or does it always distribute the jobs across all available GPUs?
The parallel runner just creates an execution pool and passes the `task_id` to the job's `run` method. The job is responsible for using this `task_id` to select the appropriate device.
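A minimal sketch of that contract from the job's side; the `run` signature is assumed from the discussion above, not taken from the merged code:

```python
def run(self, task_id: int | None = None) -> dict:
    """Run the workload, mapping the pool's task_id to a device."""
    device = f"cuda:{task_id}" if task_id is not None else "cpu"
    # ... execute the actual workload on `device` ...
    return {"device": device}
```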
"""Generate BenchmarkJob.""" | ||
|
||
def __init__(self, accelerator: str) -> None: | ||
self.accelerator = accelerator |
I'm not sure about the terminology here. We use this variable mainly to distinguish between CPU and GPU, but I'm not sure if a CPU is technically considered to be an accelerator. Maybe `device` would be a more suitable name?
Originally this was called `device`. I think we discussed changing this to `accelerator` to be in line with Lightning's terminology. I have no preference here, so I can rename it once we finalise the name.
In Lightning, `accelerator` seems to be used for CPUs as well:
https://lightning.ai/docs/pytorch/stable/extensions/accelerator.html
https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.accelerators.CPUAccelerator.html
src/anomalib/cli/cli.py
```diff
@@ -288,7 +294,7 @@ def instantiate_classes(self) -> None:
             self.model = self._get(self.config_init, "model")
             self._configure_optimizers_method_to_model()
             self.instantiate_engine()
-        else:
+        elif self.config["subcommand"] != "pipeline":
```
I would prefer `anomalib benchmark ...`, but if someone implements a custom pipeline then I feel they should be able to run it without making changes to the CLI. In that case they might have to use `anomalib pipeline custom_pipeline`?
Would it be an idea to use `warnings.warn` in the entrypoint scripts to inform the user whenever they use these features?
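A sketch of what such a warning in an entrypoint script could look like; the message wording is assumed:

```python
import warnings

warnings.warn(
    "This is an experimental feature and may change or be removed without notice.",
    UserWarning,
    stacklevel=2,
)
```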
Looking good to me. My only feedback is the lack of examples in the docstring and documentation. If you would like to add this in a follow-up PR, that is also fine.
I will check this thoroughly in a few hours and leave some feedback.
For some reason I keep getting …
```python
for runner in runners:
    try:
        _args = args.get(runner.generator.job_class.name, None)
        runner.run(_args)
    except Exception:  # noqa: PERF203 catch all exception and allow try-catch in loop
        logger.exception("An error occurred when running the runner.")
        print(
            f"There were some errors when running [red]{runner.generator.job_class.name}[/red] with"
            f" [green]{runner.__class__.__name__}[/green]."
            f" Please check [magenta]{log_file}[/magenta] for more details.",
        )
```
How would this work if I wanted to take the results from one runner and use them as the input for the next one?
I see that a single runner collects the results at the end. But in the case of the tiled ensemble, training can be parallel, followed by a single serial runner that assembles the data back together. With this design I am not entirely sure how that would work.
Then I guess this design falls short. I'll have another look at your code to see what changes need to be made.
Yeah. To me, having a pipeline would mean that you chain some elements, and the result of one is the input for the next one. I also see no problem with using pipelines recursively (not a very deep recursion). So, let's say, in the case of the tiled ensemble it would look something like:
Training[parallel] -> merging[serial] -> postprocessing (implemented as a sub-pipeline consisting of serial runners, each doing one step like threshold, norm, ...).
The tiled ensemble stores the data inside a class that handles storage at different indices. Maybe this pattern of a "storage" class can also be used in pipelines, so each runner returns one and it can be customized for each specific use case.
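A rough, self-contained sketch of that chaining idea, where each stage receives the accumulated results of the previous stages; all names here are illustrative, not the anomalib API:

```python
from typing import Any, Callable

Stage = Callable[[dict[str, Any]], Any]


def run_chained(stages: list[Stage]) -> dict[str, Any]:
    """Run stages in order, passing the accumulated results store to each one."""
    results: dict[str, Any] = {}
    for stage in stages:
        results[stage.__name__] = stage(results)
    return results


def train(results: dict[str, Any]) -> list[str]:
    # In the tiled ensemble this stage would run in parallel, one job per tile.
    return ["tile_0.ckpt", "tile_1.ckpt"]


def merge(results: dict[str, Any]) -> str:
    # Consumes the previous stage's output from the shared results store.
    return "+".join(results["train"])


print(run_chained([train, merge]))  # {'train': [...], 'merge': 'tile_0.ckpt+tile_1.ckpt'}
```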
I checked the code and the design is quite nice. I have just one question, which I commented on the relevant part.
I'll have a look. Even PatchCore breaks due to the rich progress bar in k-center-greedy. Maybe we shouldn't merge it until this issue is solved.
Merged commit 5ff7f10 into openvinotoolkit:feature/pipelines.
Merge commit summary:

* Anomalib Pipelines (#2005)
  * Add initial design; Refactor + add to CLI; Support grid search on class path; redirect outputs; design v2; remove commented code; add dummy experiment; add config; Refactor; Add tests; Apply suggestions from code review; address pr comments; refactor; Simplify argparse; modify logger redirect; update docstrings
* Fix Rich Progress with Patchcore Training (#2062)
  * Add safe track
* [Pipelines] Intra-stage result passing (#2061)
  * Same commit series as #2005, plus: Add proposal
* Update src/anomalib/pipelines/benchmark/job.py

Signed-off-by: Ashwin Vaidya <[email protected]>
Co-authored-by: Samet Akcay <[email protected]>