Annotate genes in HGNC with SO terms #118

Open
6 of 10 tasks
cthoyt opened this issue Sep 16, 2021 · 4 comments

Comments

@cthoyt
Member

cthoyt commented Sep 16, 2021

Either find or create terms for all HGNC gene locus types that can be used to annotate all genes in HGNC:

So far, the mappings I've made are in here: https://github.com/pyobo/pyobo/blob/dc7b4736f2bbf943084e8f8a95e1293c2717c566/src/pyobo/sources/hgnc.py#L110-L145

Related discussion

With HGNC on twitter:

@genenames I’m mapping locus type annotations to sequence ontology terms. Any chance you’d be able to make these mappings first-party? I could also use some help on a few that remain https://t.co/NMEQIr13AO

— Charles Tapley Hoyt (@cthoyt) September 16, 2021

On the OBO Foundry Slack workspace:

https://obo-communitygroup.slack.com/archives/C01BDKWDS91/p1631787773022200

cthoyt added a commit that referenced this issue Nov 12, 2021
cthoyt added a commit that referenced this issue Nov 12, 2021
@cthoyt
Member Author

cthoyt commented May 26, 2023

CC @sartweedie

@sartweedie

Just to clarify the situation with ‘complex locus constituent’: this isn’t for genes encoding proteins that are part of complexes; ‘complex’ here means complicated. These are unusual cases where the research community has requested names for parts of complicated loci that encode many alternate isoforms. We think the closest SO term is gene_fragment (SO:0000997).

@sartweedie

Readthroughs are another oddity: they really represent transcripts derived from more than one adjacent gene. However, they are often discussed and treated as separate ‘genes’, distinct from the component genes that contribute to the readthrough, so some have been named separately. SO:0000697 doesn’t work for these. We suggest making a new SO term for these under transcript. I can put in a ticket for this.

@sartweedie

I think SO:0001500 is fine for phenotype, even though it isn't under gene. All of the HGNC phenotype records have been withdrawn (though they still appear in our records as withdrawn).
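
For reference, the suggestions in the comments above could be recorded roughly as follows. This is only a sketch and not the code in pyobo's `hgnc.py`: the dictionary name, the helper functions, and the exact spelling of the locus type keys are assumptions made for illustration, and only the three locus types discussed in this thread are included.

```python
from typing import Dict, Iterable, List, Optional

# Locus type -> SO CURIE, limited to what was suggested in this thread.
# ASSUMPTION: the key spellings are illustrative; HGNC's exact locus type
# strings may differ.
LOCUS_TYPE_TO_SO: Dict[str, Optional[str]] = {
    # gene_fragment, suggested as the closest existing term
    "complex locus constituent": "SO:0000997",
    # considered fine even though it is not under gene
    "phenotype": "SO:0001500",
    # SO:0000697 does not fit; a new term under transcript was proposed
    "readthrough": None,
}


def get_so_term(locus_type: str) -> Optional[str]:
    """Return the SO CURIE for a locus type, or None if there is no mapping."""
    return LOCUS_TYPE_TO_SO.get(locus_type.strip().lower())


def unmapped(locus_types: Iterable[str]) -> List[str]:
    """List the locus types that still need an SO term, sorted and de-duplicated."""
    return sorted({lt for lt in locus_types if get_so_term(lt) is None})
```

Keeping `None` for readthrough makes the gap explicit, so a check like `unmapped(...)` would flag it until a suitable SO term exists.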
