UX of changing ontology IDs #277

Open
cmrn-rhi opened this issue Mar 16, 2022 · 4 comments

@cmrn-rhi
Member

cmrn-rhi commented Mar 16, 2022

As we deprecate temporarily minted GENEPIO terms in favour of the preferred domain ontology terms, users will run into validation/consistency errors unless they manually update to the new IDs (which they may not realize they have to do).

We need to facilitate updating IDs in a dataset, be it via DataHarmonizer or GEEM, taking advantage of the "term replaced by" operation.

@ddooley
Collaborator

ddooley commented Mar 16, 2022

One helpful file in that regard is a separate term deprecation file listing all the deprecated terms, each with a reference to its replacement term. This can be an import file into the main ontology. This has been set up for FoodOn but not yet for GENEPIO. We could also have a tabular version of this file for easy SQL queries or other database updates.
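
As a rough illustration of the "tabular version for SQL query" idea, here is a minimal sketch using Python's built-in `sqlite3`. The table and column names (`term_deprecation`, `deprecated_id`, `replaced_by_id`, `sample`, `host_id`) and the IDs are placeholders made up for this example, not anything that exists in GENEPIO or FoodOn today.

```python
import sqlite3

# Hypothetical rows of a tabular deprecation file: (deprecated_id, replaced_by_id).
# These IDs are placeholders, not real ontology terms.
DEPRECATIONS = [
    ("GENEPIO:9000001", "EXAMPLE:0000001"),
    ("GENEPIO:9000002", "EXAMPLE:0000002"),
]


def update_deprecated_ids(conn, table, column):
    """Rewrite deprecated IDs in table.column; return the number of rows changed.

    `table` and `column` are interpolated into the SQL text, so they must be
    trusted identifiers, never user input.
    """
    cur = conn.execute(
        f"UPDATE {table} SET {column} = ("
        f" SELECT replaced_by_id FROM term_deprecation"
        f" WHERE deprecated_id = {table}.{column})"
        f" WHERE {column} IN (SELECT deprecated_id FROM term_deprecation)"
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE term_deprecation ("
    " deprecated_id TEXT PRIMARY KEY, replaced_by_id TEXT NOT NULL)"
)
conn.executemany("INSERT INTO term_deprecation VALUES (?, ?)", DEPRECATIONS)

# A toy dataset table with one deprecated ID and one current ID.
conn.execute("CREATE TABLE sample (sample_name TEXT, host_id TEXT)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [("s1", "GENEPIO:9000001"), ("s2", "EXAMPLE:0000009")],
)

changed = update_deprecated_ids(conn, "sample", "host_id")
rows = conn.execute(
    "SELECT sample_name, host_id FROM sample ORDER BY sample_name"
).fetchall()
```

Only rows whose ID appears in the deprecation table are touched, so the same update can be re-run safely after the table grows.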

@ddooley
Collaborator

ddooley commented Nov 8, 2023

This suggests a dataset management feature, perhaps in DataHarmonizer: an option to supply DataHarmonizer with a list of mappings from deprecated to replacement values, so that DataHarmonizer could do the conversion on any given dataset. It also suggests a single pooled conversion table resource (assuming everyone agrees on appropriate replacements).
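
To make the idea concrete, here is a small Python sketch of what such a conversion step might do. It is not DataHarmonizer code; the function names, the row-as-dict dataset shape, and the field names are all assumptions for illustration. It follows replacement chains (a term replaced by a term that was itself later replaced) and returns a change log so users can see what was rewritten.

```python
def resolve(term_id, replacements):
    """Follow deprecated -> replacement links until a current ID is reached."""
    seen = set()
    while term_id in replacements:
        if term_id in seen:
            # A loop in the mapping means the table itself needs fixing.
            raise ValueError(f"Replacement cycle involving {term_id}")
        seen.add(term_id)
        term_id = replacements[term_id]
    return term_id


def convert_rows(rows, replacements, fields):
    """Return (converted_rows, changes) without mutating the input rows.

    rows:         list of dicts, one per dataset record (assumed shape)
    replacements: dict mapping deprecated ID -> replacement ID
    fields:       names of the columns that hold ontology IDs
    """
    converted, changes = [], []
    for index, row in enumerate(rows):
        new_row = dict(row)
        for field in fields:
            old = row.get(field)
            if old in replacements:
                new = resolve(old, replacements)
                new_row[field] = new
                changes.append((index, field, old, new))
        converted.append(new_row)
    return converted, changes
```

Returning the change log separately would let a tool show users exactly which values were converted before they accept the result.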

@cmrn-rhi
Member Author

cmrn-rhi commented Nov 8, 2023

> One helpful file in that regard is a separate term deprecation file listing all the deprecated terms, each with a reference to its replacement term. This can be an import file into the main ontology. This has been set up for FoodOn but not yet for GENEPIO. We could also have a tabular version of this file for easy SQL queries or other database updates.

Yeah, we currently have a deprecation import, but it only covers deprecations that were pulled off of ROBOT imports. If we extract the integrated deprecations, we can merge them and manage them all in a single separate deprecation import.

> This suggests a dataset management feature, perhaps in DataHarmonizer: an option to supply DataHarmonizer with a list of mappings from deprecated to replacement values, so that DataHarmonizer could do the conversion on any given dataset. It also suggests a single pooled conversion table resource (assuming everyone agrees on appropriate replacements).

Yes, this would be great. My recollection is that, so far, replacements have applied across specifications; we haven't had a case where a replacement occurred for one but not another.

@cmrn-rhi
Member Author

cmrn-rhi commented Nov 8, 2023

So this is a joint GENEPIO / DataHarmonizer (or pathogen-genomics-package) issue, with the latter needing the conversion table. I can add generation of the conversion table from GENEPIO scripts as part of the specification ontologizing script, but it would be helpful to know what the format of the conversion table should (or might) be.
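
As a starting point for that discussion, one possible shape is a plain TSV with one row per deprecated term. The column names below (`deprecated_id`, `deprecated_label`, `replaced_by_id`) and the IDs are purely hypothetical, and the loader is only a sketch of how a consumer might read and sanity-check such a file in Python.

```python
import csv
import io

# Columns a consumer would need at minimum (assumed names, not an agreed format).
REQUIRED_COLUMNS = ("deprecated_id", "replaced_by_id")

# Placeholder content; the IDs and labels are invented for this example.
EXAMPLE_TSV = (
    "deprecated_id\tdeprecated_label\treplaced_by_id\n"
    "GENEPIO:9000001\tplaceholder term\tEXAMPLE:0000001\n"
    "GENEPIO:9000002\tno replacement yet\t\n"
)


def load_conversion_table(handle):
    """Read a TSV conversion table into a {deprecated_id: replaced_by_id} dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    mapping = {}
    for row in reader:
        old = (row["deprecated_id"] or "").strip()
        new = (row["replaced_by_id"] or "").strip()
        if not old or not new:
            continue  # deprecated without an agreed replacement: nothing to convert
        if mapping.get(old, new) != new:
            raise ValueError(f"Conflicting replacements for {old}")
        mapping[old] = new
    return mapping


table = load_conversion_table(io.StringIO(EXAMPLE_TSV))
```

Extra columns such as labels are ignored by the loader, so the file can carry human-readable context without affecting the conversion.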
