-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Simplify code of classification #28
base: main
Are you sure you want to change the base?
Conversation
- Also, change the function name and signature to "predict(X)" to make it more similar to scikit-learn. - Also, remove runICA and the other function for registration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me but should be complemented by an update in the CLI to make sure we have the ICA components. Same with aroma.py
.
Don't you prefer to go step by step with targeted PRs? We can deal with the CLI once we have a functional prototype. |
Once again, the CLI is literally the last thing I would care for :D |
aroma/utils.py
Outdated
features_df.to_csv( | ||
op.join(out_dir, "classification_overview.txt"), sep="\t", index_label="IC" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this file is no longer written out here, it would be good to write it out in the workflow. We also need a corresponding change in the workflow function, I think.
The core of the component classification code in tedana is that each classification step is its own function. I don't think it's realistic to completely harmonize the two sets of classification codes now, but if you set it up so that each classification decision is modularized, it should be realistic to use the same system for both in the near future. |
def write_metrics(features_df, out_dir, metric_metadata=None): | ||
"""Write out feature/classification information and metadata. | ||
|
||
Parameters | ||
---------- | ||
features_df : (C x 5) :obj:`pandas.DataFrame` | ||
DataFrame with metric values and classifications. | ||
Must have the following columns: "edge_fract", "csf_fract", "max_RP_corr", "HFC", and | ||
"classification". | ||
out_dir : :obj:`str` | ||
Output directory. | ||
metric_metadata : :obj:`dict` or None, optional | ||
Metric metadata in a dictionary. | ||
|
||
Returns | ||
------- | ||
motion_ICs : array_like | ||
Array containing the indices of the components identified as motion components. | ||
|
||
Output | ||
------ | ||
AROMAnoiseICs.csv : A text file containing the indices of the | ||
components identified as motion components | ||
desc-AROMA_metrics.tsv | ||
desc-AROMA_metrics.json | ||
""" | ||
# Put the indices of motion-classified ICs in a text file (starting with 1) | ||
motion_ICs = features_df["classification"][features_df["classification"] == "rejected"].index | ||
motion_ICs = motion_ICs.values | ||
|
||
with open(op.join(out_dir, "AROMAnoiseICs.csv"), "w") as fo: | ||
out_str = ",".join(motion_ICs.astype(str)) | ||
fo.write(out_str) | ||
|
||
# Create a summary overview of the classification | ||
out_file = op.join(out_dir, "desc-AROMA_metrics.tsv") | ||
features_df.to_csv(out_file, sep="\t", index_label="IC") | ||
|
||
if isinstance(metric_metadata, dict): | ||
with open(op.join(out_dir, "desc-AROMA_metrics.json"), "w") as fo: | ||
json.dump(metric_metadata, fo, sort_keys=True, indent=4) | ||
|
||
return motion_ICs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was very long ago, but it seems to me that the plan would be to write the outputs somewhere else, when we have more clarity on what we want to exactly write out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will create io.py
and put it in there.
Ok guys, I've made the following changes:
What do you think @tsalo @CesarCaballeroGaudes @oesteban? Edit: no idea why the style check fails. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only problem is that now we're not tracking why each "bad" component is classified as such. I understand if we dropped "rationale" in favor of a list of tags, as discussed for tedana, but it looks like this information is just completely dropped.
HYPERPLANE = np.array([-19.9751070082159, 9.95127547670627, 24.8333160239175]) | ||
|
||
|
||
def hfc_criteria(x, thr_hfc=THR_HFC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def hfc_criteria(x, thr_hfc=THR_HFC): | |
def hfc_criterion(x, thr_hfc=THR_HFC): |
Since it's just one criterion.
:obj:`pandas.DataFrame` | ||
Features table with additional column "classification". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:obj:`pandas.DataFrame` | |
Features table with additional column "classification". | |
:obj:`numpy.ndarray` | |
Classification (``True`` if the component is a CSF one). |
Co-authored-by: Taylor Salo <[email protected]>
Closes # .
Changes proposed in this pull request: