-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MetaWorkflow Handler and MetaWorkflow Run Handler: Pipeline Automation #11
base: master
Are you sure you want to change the base?
Conversation
…culated attributes
Some are specific to the structure of CGAP portal schemas, but tried to generalize as much as possible. Also includes partial draft of dependency validation via topological sort.
…ests for topological sort.
Resulted in edits within test files, including creation of a new test file for topological sort, and changes in imports for magma/metawfl_handler.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropping some comments for consideration. Happy to review again whenever you'd like, especially once there are tests available for the dependency sorting/checking. I think I understand the gist of it all and it sounds right, but I didn't go in detail; I'll do so once the tests are in.
Changes included additions of different directed graph global vars for testing -- tests for the topological sort on their way. Also removed function for creating "forward dependencies" -- this version of topological sort works with "backward dependencies".
Addressed some prior comments. Future planned changes include creating a class for the topological sort function to decrease the number of command line arguments that are common among several functions.
Also addressed some prior comments on draft.
Changes included merging some former magma utils fxns into this class, and removing extraneous utils fxns. Also includes new test file for this class.
Also wrote a draft of pytests
includes addition of custom Exception classes
…onstants. Got rid of duplication_flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving a batch of comments, some of which will require further discussion. It looks like there's still a good deal to complete, clean up, and test properly, so we should discuss a plan tomorrow for how best to utilize the remaining time.
#!/usr/bin/env python3 | ||
|
||
################################################################# | ||
# Vars | ||
################################################################# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for the shebang or the headers here and elsewhere. I know Michele does this, but I find it unnecessary clutter.
magma/magma_constants.py
Outdated
TITLE = "title" | ||
|
||
# MetaWorkflow Handler attributes | ||
PROJECT = "project" | ||
INSTITUTION = "institution" | ||
UUID = "uuid" | ||
META_WORKFLOWS = "meta_workflows" | ||
ORDERED_META_WORKFLOWS = "ordered_meta_workflows" | ||
META_WORKFLOW = "meta_workflow" | ||
NAME = "name" | ||
DEPENDENCIES = "dependencies" | ||
ITEMS_FOR_CREATION_PROP_TRACE = "items_for_creation_property_trace" | ||
ITEMS_FOR_CREATION_UUID = "items_for_creation_uuid" | ||
|
||
# MetaWorkflow Run Handler attributes | ||
STATUS = "status" | ||
FINAL_STATUS = "final_status" | ||
ASSOCIATED_META_WORKFLOW_HANDLER = "meta_workflow_handler" | ||
ASSOCIATED_ITEM = "associated_item" | ||
META_WORKFLOW_RUN = "meta_workflow_run" | ||
META_WORKFLOW_RUNS = "meta_workflow_runs" | ||
ITEMS_FOR_CREATION = "items_for_creation" | ||
ERROR = "error" | ||
# statuses | ||
PENDING = "pending" | ||
RUNNING = "running" | ||
COMPLETED = "completed" | ||
FAILED = "failed" | ||
STOPPED = "stopped" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to organize the constants into classes so they can be imported in groups. In general, importing *
is not a good practice, and having the constants on classes makes it more clear where they are coming from.
Also, I don't believe all of these constants should be in the magma
directory, as some are portal specific, and those should be differentiated. In general, I'd be fine with all this code living in magma_ff
for now, but that's something to discuss more with Michele.
magma/magma_constants.py
Outdated
#TODO: the following is here in case dup flag is added in the future | ||
# MWFR_TO_HANDLER_STEP_STATUS_DICT = { | ||
# "pending": "pending", | ||
# "running": "running", | ||
# "completed": "completed", | ||
# "failed": "failed", | ||
# "inactive": "pending", | ||
# "stopped": "stopped", | ||
# "quality metric failed": "failed" | ||
# } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still needed regardless of the duplication flag stuff...
Regardless, it is portal-specific and thus shouldn't live in magma
, and there's no need to leave the commented out code around if it's no longer needed.
################################################ | ||
# ValidatedDictionary TODO: eventually make part of dcicutils? | ||
################################################ | ||
class ValidatedDictionary(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider renaming this. It's not a dictionary, but rather a class, so ValidatedAttributes
would probably be more apt.
In general, I would stay away from this style of code of arbitrarily placing all key, value pairs in a dictionary on the class itself. I know it's done elsewhere in this project, but it complicates tracing the origin of attributes to someone new to the code as well as placing an unknown number of attributes on the class that are then capable of being accidentally overwritten. As they say for python, "explicit is better than implicit." To my eyes, a superior alternative to this is to set an attribute on the class via __init__
such as self.properties
, and then placing the input dictionary there and validating and setting other properties or attributes appropriately.
magma/validated_dictionary.py
Outdated
if retrieved_attr is None: | ||
raise AttributeError("attribute %s cannot have value 'None'." % attribute) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't an attribute be None
? Would a falsy but non-None
value be acceptable?
magma_ff/checkstatus.py
Outdated
for running_mwfr_step_name in self.mwfr_handler_obj.running_steps(): | ||
|
||
# Get run uuid | ||
run_uuid = self.mwfr_handler_obj.get_step_attr(running_mwfr_step_name, uuid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe uuid
is defined here.
# and convert property trace(s) to uuid(s) | ||
else: | ||
property_traces = getattr(meta_workflow_step, ITEMS_FOR_CREATION_PROP_TRACE, None) | ||
if not isinstance(property_traces, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this has to be a list, no?
item_uuid = make_embed_request( | ||
self.associated_item_identifier, | ||
item_prop_trace | ||
+ ".uuid", # TODO: are we assuming the user will include ".uuid" or @id as part of prop trace? | ||
self.auth_key, | ||
single_item=True, | ||
) | ||
if not item_uuid: | ||
raise MetaWorkflowRunHandlerCreationError( | ||
f"Invalid property trace '{item_prop_trace}' on item with the following ID: {self.associated_item_identifier}" | ||
) | ||
items_for_creation_uuids.append(item_uuid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this logic will work here, as you may receive multiple UUIDs, not just one, and you have to parse the result to obtain any UUIDs returned.
@property | ||
def get_project(self): | ||
"""Retrieves project attribute from the associated item.""" | ||
return self.retrieved_associated_item.get(PROJECT) | ||
|
||
@property | ||
def get_institution(self): | ||
"""Retrieves institution attribute from the associated item.""" | ||
return self.retrieved_associated_item.get(INSTITUTION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a personal preference, consider using get
in a method name for non-attribute/property methods, and using attribute-like names for properties (they live somewhere in-between, but I think of them more like attributes than methods).
No description provided.