Saving extra metadata in signature format #3371

Open
mr-eyes opened this issue Nov 4, 2024 · 6 comments

@mr-eyes
Member

mr-eyes commented Nov 4, 2024

In snipe, I wanted to record the number of sketched bases so I could assess sketching efficiency. However, there is no place in the sourmash signature JSON to hold this information, so I had to add a custom suffix to the signature name. It would be great if we could add a metadata dict to the sourmash signature.
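
To make the request concrete, here is a minimal sketch contrasting the name-suffix workaround with a dedicated metadata dict. The record is a simplified stand-in, not the full sourmash signature format, and the `metadata` key, the `sketched_bases` key, and the `__bases=` suffix convention are all hypothetical names chosen for illustration.

```python
import json

# Simplified stand-in for one signature record (not the full sourmash format).
record = {"name": "sample_A", "filename": "sample_A.fastq.gz"}

# Workaround: encode the value in the name and parse it back out later.
SUFFIX = "__bases="  # hypothetical convention
with_suffix = dict(record, name=f"{record['name']}{SUFFIX}123456789")
base_name, _, raw = with_suffix["name"].partition(SUFFIX)
bases_from_name = int(raw)

# Proposal: a separate dict, leaving the name untouched.
# Both "metadata" and "sketched_bases" are hypothetical keys.
with_metadata = dict(record, metadata={"sketched_bases": 123456789})

# The dict survives a JSON round trip without any string parsing.
restored = json.loads(json.dumps(with_metadata))
```

With the suffix approach, every consumer has to know the naming convention and strip it before displaying the name; with a separate dict, the name stays clean and the value keeps its type.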

@bluegenes
Contributor

bluegenes commented Nov 4, 2024

@luizirber suggested a generic metadata field, but notes "The danger of generic metadata with key/val is that you can NEVER depend on the value actually being there, so any code using those values need to account for that"

@ccbaumler additionally wants info on whether a sketch is a pangenome sketch
@bluegenes wants to add information on whether or not a sketch is a translated sketch
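
The caveat about generic key/value metadata can be sketched as follows: any reader has to treat a missing key (or a missing metadata dict altogether) as a normal case. This is only an illustration on plain dicts; `get_flag`, the `metadata` key, and the `is_pangenome` and `is_translated` keys are hypothetical names, not part of sourmash.

```python
from typing import Any, Optional

def get_flag(record: dict[str, Any], key: str) -> Optional[bool]:
    """Return a boolean metadata flag, or None when it is absent or not a bool."""
    metadata = record.get("metadata")
    if not isinstance(metadata, dict):
        return None
    value = metadata.get(key)
    return value if isinstance(value, bool) else None

# A record written before any metadata existed, and one that carries a flag.
old_record = {"name": "genome_1"}
new_record = {"name": "pangenome_1", "metadata": {"is_pangenome": True}}

for rec in (old_record, new_record):
    flag = get_flag(rec, "is_pangenome")
    label = "unknown" if flag is None else str(flag)
    print(rec["name"], "is_pangenome:", label)
```

Callers end up with three states (true, false, unknown) instead of two, which is exactly the cost of optional metadata.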

@ctb
Contributor

ctb commented Nov 4, 2024

concern: metadata may get out of date. Could we tie the metadata to the md5 hash of the original data file or something?
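
One way this could look, as a rough sketch using only the Python standard library: store the digest of the source file next to the metadata, and treat the metadata as stale when the digest no longer matches. The `source_md5` / `values` layout and the function names are assumptions made for this example, not an existing sourmash mechanism.

```python
import hashlib
import os
import tempfile

def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def attach_metadata(path: str, values: dict) -> dict:
    """Bundle metadata with the digest of the file it was computed from."""
    # Hypothetical layout: digest and user values side by side.
    return {"source_md5": file_md5(path), "values": dict(values)}

def is_stale(path: str, metadata: dict) -> bool:
    """True when the file no longer matches the recorded digest."""
    return file_md5(path) != metadata["source_md5"]

# Demo with a temporary file standing in for the original data file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reads.fa")
    with open(path, "w") as handle:
        handle.write(">r1\nACGT\n")
    meta = attach_metadata(path, {"sketched_bases": 4})
    fresh = is_stale(path, meta)    # file unchanged since metadata was recorded
    with open(path, "a") as handle:
        handle.write(">r2\nTTTT\n")
    changed = is_stale(path, meta)  # file modified after metadata was recorded
```

This only detects that the source file changed; it cannot tell whether the metadata values themselves were edited by hand.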

plugins for manipulating metadata would be great!

related issues:

I think one place we talked about mechanisms for this that were unrelated to modifying Signature was here: #2180

@mr-eyes
Member Author

mr-eyes commented Nov 4, 2024

I would like to also mention #2985 here.

@ccbaumler
Contributor

Pangenome-related metadata could also include the count of genomes that have been compressed into the pangenome. This is an important metric for judging the reliability of the pangenome element characterization.

@bluegenes
Contributor

#2219

@bluegenes
Contributor

@luizirber suggests that the metadata field could be open to users to modify, not used internally in sourmash. If we want to store and use a field internally, we can create actual individual fields (so each is guaranteed to have an entry w/specific meaning).
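
A minimal sketch of that split, assuming a hypothetical `SketchInfo` container (not an actual sourmash class): dedicated fields always exist and have defaults, while the free-form dict belongs to the user and is never read by internal logic.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class SketchInfo:  # hypothetical class, not part of sourmash
    name: str
    # Dedicated fields: always present, with a fixed meaning and default.
    is_pangenome: bool = False
    is_translated: bool = False
    # Free-form, user-owned data: internal code never reads from it.
    user_metadata: dict[str, Any] = field(default_factory=dict)

def describe(info: SketchInfo) -> str:
    """Internal logic relies only on the dedicated fields."""
    kind = "pangenome" if info.is_pangenome else "single-genome"
    return f"{info.name}: {kind}, translated={info.is_translated}"

plain = SketchInfo(name="genome_1")
annotated = SketchInfo(
    name="pangenome_1",
    is_pangenome=True,
    user_metadata={"sketched_bases": 123456789, "note": "anything goes here"},
)
```

Because `describe` never touches `user_metadata`, a missing or oddly typed user key cannot break internal behavior, while fields such as `is_pangenome` are guaranteed to have a value.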
