Annotation transfer user stories #156

Open
2 of 13 tasks
dosumis opened this issue Mar 26, 2024 · 5 comments

Comments

@dosumis
Collaborator

dosumis commented Mar 26, 2024

As a biologist/bioinformatician, I want to view annotation transfer from another dataset/taxonomy at the cell or cell-set level in order to understand how it supports current annotation or how it might support changes to the current annotation of the taxonomy I am viewing/editing. To support this, users should be able to view:

  • Annotation transfer at the cell set level - as a table of annotation transfers on any cell set.
  • The degree of overlap between two cell sets: one defined in a taxonomy hierarchy, the other defined by annotation transfer to single cells. This could take the form of a confusion matrix with either Jaccard scores plus a confidence filter, or weighted Jaccard scores (using the confidence score). The former is currently supported by the Annotation Comparison Shiny App.
  • Extended Annotations (all columns in an annotation table) on two related cell sets - currently supported by Annotation Comparison Shiny App in cases of annotation transfer on the cell level.
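
The two scoring options in the list above could be sketched roughly as follows. This is only an illustration, not how CAS-Tools or the Annotation Comparison Shiny App compute it: the function names and the `(taxonomy_set, transferred_label, confidence)` input shape are assumptions, cells below the confidence threshold are treated as having no transferred label, and "weighted Jaccard" is read here as the usual sum-of-minima over sum-of-maxima form with confidences in [0, 1].

```python
from collections import defaultdict


def jaccard_matrix(cells, min_confidence=0.0):
    """Plain Jaccard per (taxonomy_set, transferred_label) pair.

    cells: iterable of (taxonomy_set, transferred_label, confidence).
    Cells below min_confidence still count towards their taxonomy set,
    but are treated as having no transferred label (an assumption).
    """
    tax_size = defaultdict(int)
    label_size = defaultdict(int)
    shared = defaultdict(int)
    for tax_set, label, conf in cells:
        tax_size[tax_set] += 1
        if conf >= min_confidence:
            label_size[label] += 1
            shared[(tax_set, label)] += 1
    return {
        (a, b): n / (tax_size[a] + label_size[b] - n)
        for (a, b), n in shared.items()
    }


def weighted_jaccard_matrix(cells):
    """Weighted Jaccard: taxonomy membership has weight 1, transferred
    membership has weight equal to the confidence score (assumed <= 1)."""
    tax_size = defaultdict(int)
    label_weight = defaultdict(float)
    shared_weight = defaultdict(float)
    for tax_set, label, conf in cells:
        tax_size[tax_set] += 1
        label_weight[label] += conf
        shared_weight[(tax_set, label)] += conf
    # sum(min) = shared weight; sum(max) = |A| + weight of B outside A
    return {
        (a, b): w / (tax_size[a] + label_weight[b] - w)
        for (a, b), w in shared_weight.items()
    }


# Toy data: four cells, two taxonomy cell sets, two transferred labels.
cells = [("A", "X", 1.0), ("A", "X", 0.5), ("A", "Y", 0.9), ("B", "Y", 0.8)]
plain = jaccard_matrix(cells)
filtered = jaccard_matrix(cells, min_confidence=0.6)
weighted = weighted_jaccard_matrix(cells)
```

Pairs that share no cells are simply absent from the returned dict, so a viewer would render them as zero.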

CAS representation:

TDT functionality required:

  • Support loading annotation transfer to single cells from CSV files (this could be via configuration rather than strict specification of column headers, but we could also just be strict about format).
  • Support export to a format that can be used by Annotation Comparison Shiny App (This should be via CAS-Tools)
  • TDT should support loading 2 taxonomies, where the second taxonomy can be retrieved automatically via a PURL.
  • TDT should support easy comparison of annotations on two cell sets in different taxonomies. This might require a new view.
  • TDT should support generating and viewing a confusion matrix for any labelsets with overlapping rather than hierarchical annotation. This covers both AT and taxonomies with overlapping cell sets (e.g. cross-areal vs single-area taxonomies of cortical neurons in Jorstad 2023). The confusion matrix should offer a choice of Jaccard with filtering on confidence, or weighted Jaccard incorporating confidence.
    • Implement confusion matrix generation in CAS-Tools
    • Implement UI over confusion matrix generation in TDT. MVP can be static image, but this would be much more powerful if it were interactive - e.g. choosing a spot could prompt display of annotations on the two related cell sets.
  • TDT should link to annotated H5AD files for both taxonomies, to support closer investigation of gene expression etc. for nodes related by annotation transfer. This can easily be supported insofar as taxonomies include these links, which is currently only the case for data on CxG.
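
For the "configuration rather than strict column headers" option in the first bullet, a minimal sketch might look like this. Everything here is hypothetical: the function name, the three logical fields (`cell_id`, `label`, `confidence`), and the choice to skip lines starting with `#` (on the assumption that some exports prepend metadata comments) are illustrative, not an agreed TDT or CAS format.

```python
import csv
import io

# Hypothetical default mapping: logical field -> CSV column header.
DEFAULT_COLUMNS = {
    "cell_id": "cell_id",
    "label": "label",
    "confidence": "confidence",
}


def load_annotation_transfer(handle, columns=DEFAULT_COLUMNS):
    """Read per-cell annotation transfer rows from an open CSV file.

    columns maps the three logical fields to whatever headers the
    file actually uses, so the format need not be fixed in advance.
    """
    # Assumption: lines starting with '#' are metadata comments.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows)
    missing = set(columns.values()) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [
        {
            "cell_id": row[columns["cell_id"]],
            "label": row[columns["label"]],
            "confidence": float(row[columns["confidence"]]),
        }
        for row in reader
    ]


# Usage with made-up headers supplied via configuration.
example = io.StringIO(
    "# exported by some mapping tool\n"
    "cell,cluster_name,prob\n"
    "c1,L2/3 IT,0.9\n"
    "c2,Pvalb,0.4\n"
)
config = {"cell_id": "cell", "label": "cluster_name", "confidence": "prob"}
records = load_annotation_transfer(example, config)
```

Being strict about format instead would amount to dropping the `columns` argument and always requiring the default headers.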
@dosumis
Collaborator Author

dosumis commented Mar 27, 2024

TBA: User stories around importing annotation transfer - from MapMyCells & from custom CSV.

@AvolaAmg @hkir-dev - please review

@dosumis
Collaborator Author

dosumis commented Apr 2, 2024

CC @jeremymiller

@AvolaAmg
Contributor

AvolaAmg commented Apr 3, 2024

Please see a potential user story attached, with MapMyCells and a custom .csv as examples.
Let me know how to implement the story. In the future, I can include screenshots of the output files and potentially a screenshot of how the annotation transfer would look in the TDT.

User story - biologist/bioinformatician who wants to visualise and exploit the annotation transfer on the taxonomy viewed in TDT

As a biologist/bioinformatician, I want to look at the annotation transfer to improve the taxonomy I am curating in the Taxonomy Development Tool (TDT). This might help me understand how the annotation transfer supplements the annotation already present in the taxonomy and how it supports my work of editing cell types/cell sets and adding new cell sets in the taxonomy I am viewing and editing.
To assess how I can use the annotation transfer to edit the taxonomy, I should be able to:

  1. Quantify the level of overlap between two cell sets, one provided by the annotation transfer and the other present in the taxonomy hierarchy. This means that I should be able to quantify how many cell types are part of both annotations. To assess this, I want to visualise the degree of overlap using Jaccard scores or a confusion matrix, which the TDT should provide. Jaccard scores are already used to identify the overlap between two cell sets in the Annotation Comparison Shiny App.
  2. View the extended annotations for two cell sets (those that overlap according to the Jaccard scores), i.e. all the columns for those two cell sets in the annotation table. This would allow me to understand what the two cell sets refer to and to assess their overlap from a biological point of view.

One example of an annotation transfer that could help me enrich the information in my taxonomy is the one obtained by mapping my dataset of interest on the MapMyCells platform. From TDT, I would export an AnnData file (.h5ad) and upload it to MapMyCells, using the hierarchical mapping algorithm. From the analysis, I would obtain a .csv output file, and the results would be used to understand how closely the cells in the TDT taxonomy correspond to those from the MapMyCells platform.
Another example of annotation transfer that could be used to enrich the taxonomy and understand its cell hierarchy is a custom annotation transfer from a pre-analysed .csv file (e.g. in the case of the NHP basal ganglia taxonomy, the AIT115_human_BGplus_mapping_results). I could use the TDT to load the annotation transfer files and quantify/analyse how the annotations correspond to those of my taxonomy in order to build an appropriate hierarchy.
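
As a rough illustration of the "quantify/analyse how the annotations correspond" step, the per-cell mapping results could be rolled up into a table of transferred labels for each cell set. This is a sketch under assumptions: the function name, the `(cell_set, transferred_label, confidence)` tuples, and the output columns are all hypothetical, and the real MapMyCells or custom CSV columns would need to be mapped onto them first.

```python
from collections import defaultdict


def transfer_table(cells):
    """Summarise transferred labels per taxonomy cell set.

    cells: iterable of (cell_set, transferred_label, confidence).
    Returns one row per (cell_set, transferred_label) with the cell
    count, the fraction of the cell set, and the mean confidence.
    """
    stats = defaultdict(lambda: [0, 0.0])  # pair -> [n_cells, confidence sum]
    set_sizes = defaultdict(int)
    for cell_set, label, conf in cells:
        entry = stats[(cell_set, label)]
        entry[0] += 1
        entry[1] += conf
        set_sizes[cell_set] += 1
    table = [
        {
            "cell_set": cell_set,
            "transferred_label": label,
            "n_cells": n,
            "fraction": n / set_sizes[cell_set],
            "mean_confidence": total / n,
        }
        for (cell_set, label), (n, total) in stats.items()
    ]
    # Most frequent transferred label first within each cell set.
    return sorted(table, key=lambda r: (r["cell_set"], -r["n_cells"]))


# Toy data only; not taken from any real mapping result.
cells = [("A", "X", 1.0), ("A", "X", 0.5), ("A", "Y", 0.9), ("B", "Y", 0.8)]
table = transfer_table(cells)
```

A table like this covers the cell-set-level view; the Jaccard or confusion-matrix view would be computed from the same per-cell tuples.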

@jeremymiller

I think this thread captures the various use cases surrounding annotation transfer quite nicely:

  • Mapping followed by cell level annotations (e.g., now each cell has both a cluster from this taxonomy and a mapped cluster from another taxonomy).
  • Using annotation comparison (or other) tools to visualize these relationships at the cluster level
  • Using TDT to update cluster annotations based on mapping result + comparison tools (either unidirectionally, e.g., with MapMyCells, or bidirectionally, e.g., using human to match NHP and NHP to match human at the same time)
  • Storing relevant data and metadata from mapping in the taxonomy
  • Having all the tools play nicely together

Let me know if you want anything else from me on this thread.

@dosumis
Collaborator Author

dosumis commented Apr 4, 2024

Parking a related idea here before I forget it:

Annotation transfer labelsets should be optional imports - declared with an IRI and stored in the repo of the taxonomy to which annotations have been transferred. We can use a simple formula to roll IRIs and resolve without declaration. @hkir-dev - does this make sense to you?
