Customized metrics #196
Hello, I'm working on the EvoScale BioML hackathon, so I'm under time constraints... I have a custom metric that doesn't fit into the two-column specification of your design. It would be great to connect with you to work out a solution for our submission. Thanks.
Hi @wconnell, thanks for reaching out. Could you provide some more details on what you're trying to do? Custom metrics is a complex feature we won't be able to implement soon, but maybe I can help rethink the structure of your dataset / benchmark into a format that we can already support.
hey, thanks for getting back to me @cwognum. we are uploading a new dataset called OpenPlasmid. our evaluation looks at how well different plasmid sequence embedding methods reflect the similarity of plasmid feature annotations. so, we basically take the plasmid embeddings of a new method, cluster them, and then compute NMI and ARI against the ground-truth labels to quantify expected similarity. any ideas how this can fit into your framework?
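A minimal sketch of the evaluation described in the comment above, assuming the embeddings arrive as a NumPy array and that KMeans is the clustering step (`evaluate_embeddings` and the choice of KMeans are illustrative, not part of OpenPlasmid):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import normalized_mutual_info_score

def evaluate_embeddings(embeddings: np.ndarray, labels: np.ndarray, n_clusters: int) -> dict:
    """Cluster plasmid embeddings, then score the clusters against feature annotations."""
    # Any clustering algorithm could be substituted here; KMeans keeps the sketch simple.
    predicted = KMeans(n_clusters=n_clusters, random_state=0).fit_predict(embeddings)
    return {
        "nmi": normalized_mutual_info_score(labels, predicted),
        "ari": adjusted_rand_score(labels, predicted),
    }
```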
Oeh, that's interesting, but I'm afraid it's not a great fit for the current state of the platform... We've been focused on predictive modeling. However, it sounds like you could ask people to submit the cluster annotations and then compare those against the ground-truth clustering. So, for example, your dataset could store the ground-truth cluster label for each plasmid, e.g. `[0, 0, 1, 1]`, and the predictions would then be the submitted cluster assignments, e.g. `[1, 1, 0, 2]`. So:

```python
from sklearn.metrics.cluster import normalized_mutual_info_score as nmi
from sklearn.metrics import adjusted_rand_score as ari

nmi([0, 0, 1, 1], [1, 1, 0, 2])
# Gives: ~0.800

ari([0, 0, 1, 1], [1, 1, 0, 2])
# Gives: ~0.571
```

That does mean you don't have any control over the clustering algorithm. I understand that may not be ideal, but you could make clear in your README how people are supposed to do the clustering and ask them to attach a link to the code when they submit results (Polaris has a dedicated field for this, see here).
ok, yeah that's probably a sufficient workaround for now. thanks for the suggestion!
hey, figured I'd be back... realizing that there is no way to add new metrics for use with the Polaris client.
**Is your feature request related to a problem? Please describe.**
New metrics need to be implemented and hard-coded in the Polaris client whenever new modalities and benchmarks introduce them. This has been a bottleneck in the benchmark creation process.
**Describe the solution you'd like**
An approach is needed that allows flexible, customized metrics while maintaining the robustness of the Polaris client codebase. Balancing flexibility with stability is key: users should be able to introduce new metrics easily without compromising the integrity of the system.
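One possible shape for such an approach (a hypothetical sketch, not an existing Polaris API): a registry that lets benchmark authors plug in a named callable, so the client only has to validate names and dispatch calls:

```python
from typing import Callable, Dict
import numpy as np

# Hypothetical registry of user-defined metrics; Polaris does not expose this today.
CUSTOM_METRICS: Dict[str, Callable[[np.ndarray, np.ndarray], float]] = {}

def register_metric(name: str):
    """Decorator that registers a custom metric under a unique name."""
    def wrapper(fn: Callable[[np.ndarray, np.ndarray], float]):
        if name in CUSTOM_METRICS:
            raise ValueError(f"Metric {name!r} is already registered")
        CUSTOM_METRICS[name] = fn
        return fn
    return wrapper

@register_metric("nmi")
def nmi(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    from sklearn.metrics.cluster import normalized_mutual_info_score
    return normalized_mutual_info_score(y_true, y_pred)
```

A registry like this keeps the flexible part (the metric body) in user code, while the client retains control over the call signature and naming collisions.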