Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Simplify code of classification #28

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

oesteban
Copy link

  • Also, change the function name and signature to "predict(X)" to make it more similar to scikit-learn.
  • Also, remove runICA and the other function for registration.

Closes # .

Changes proposed in this pull request:

aroma/utils.py Outdated Show resolved Hide resolved
- Also, change the function name and signature to "predict(X)" to make it more similar to scikit-learn.
- Also, remove runICA and the other function for registration.
Copy link
Collaborator

@eurunuela eurunuela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me but should be complemented by an update in the CLI to make sure we have the ICA components. Same with aroma.py.

@oesteban
Copy link
Author

Don't you prefer to go step by step with targeted PRs? We can deal with the CLI once we have a functional prototype.

@oesteban
Copy link
Author

Once again, the CLI is literally the last thing I would care for :D

aroma/utils.py Outdated
Comment on lines 338 to 340
features_df.to_csv(
op.join(out_dir, "classification_overview.txt"), sep="\t", index_label="IC"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this file is no longer written out here, it would be good to write it out in the workflow. We also need a corresponding change in the workflow function, I think.

@eurunuela eurunuela mentioned this pull request Nov 19, 2021
@handwerkerd
Copy link
Member

handwerkerd commented Nov 22, 2021

The core of the component classification code in tedana is that each classification step is its own function. I don't think it's realistic to completely harmonize the two sets of classification codes now, but if you set it up so that each classification decision is modularized, it should be realistic to use the same system for both in the near future.

Comment on lines -126 to -168
def write_metrics(features_df, out_dir, metric_metadata=None):
"""Write out feature/classification information and metadata.

Parameters
----------
features_df : (C x 5) :obj:`pandas.DataFrame`
DataFrame with metric values and classifications.
Must have the following columns: "edge_fract", "csf_fract", "max_RP_corr", "HFC", and
"classification".
out_dir : :obj:`str`
Output directory.
metric_metadata : :obj:`dict` or None, optional
Metric metadata in a dictionary.

Returns
-------
motion_ICs : array_like
Array containing the indices of the components identified as motion components.

Output
------
AROMAnoiseICs.csv : A text file containing the indices of the
components identified as motion components
desc-AROMA_metrics.tsv
desc-AROMA_metrics.json
"""
# Put the indices of motion-classified ICs in a text file (starting with 1)
motion_ICs = features_df["classification"][features_df["classification"] == "rejected"].index
motion_ICs = motion_ICs.values

with open(op.join(out_dir, "AROMAnoiseICs.csv"), "w") as fo:
out_str = ",".join(motion_ICs.astype(str))
fo.write(out_str)

# Create a summary overview of the classification
out_file = op.join(out_dir, "desc-AROMA_metrics.tsv")
features_df.to_csv(out_file, sep="\t", index_label="IC")

if isinstance(metric_metadata, dict):
with open(op.join(out_dir, "desc-AROMA_metrics.json"), "w") as fo:
json.dump(metric_metadata, fo, sort_keys=True, indent=4)

return motion_ICs
Copy link
Collaborator

@eurunuela eurunuela Nov 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to remove this @tsalo @oesteban? I do not remember why it was removed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was very long ago, but it seems to me that the plan would be to write the outputs somewhere else, when we have more clarity on what we want to exactly write out.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will create io.py and put it in there.

@eurunuela
Copy link
Collaborator

eurunuela commented Nov 23, 2021

Ok guys, I've made the following changes:

  • The function to save the metrics write_metrics has been moved to io.py.
  • The classification is done in classification.py with a function for each criteria as @handwerkerd mentioned.

What do you think @tsalo @CesarCaballeroGaudes @oesteban?

Edit: no idea why the style check fails.

Copy link
Member

@tsalo tsalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only problem is that now we're not tracking why each "bad" component is classified as such. I understand if we dropped "rationale" in favor of a list of tags, as discussed for tedana, but it looks like this information is just completely dropped.

aroma/aroma.py Outdated Show resolved Hide resolved
HYPERPLANE = np.array([-19.9751070082159, 9.95127547670627, 24.8333160239175])


def hfc_criteria(x, thr_hfc=THR_HFC):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def hfc_criteria(x, thr_hfc=THR_HFC):
def hfc_criterion(x, thr_hfc=THR_HFC):

Since it's just one criterion.

Comment on lines +69 to +70
:obj:`pandas.DataFrame`
Features table with additional column "classification".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:obj:`pandas.DataFrame`
Features table with additional column "classification".
:obj:`numpy.ndarray`
Classification (``True`` if the component is a CSF one).

Co-authored-by: Taylor Salo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants