Providing additional intrinsic evaluation metrics #253

Closed
jaksle opened this issue Jun 6, 2023 · 4 comments
@jaksle
Contributor

jaksle commented Jun 6, 2023

This is more of a question than an issue. I wanted to do time series clustering using Clustering.jl, and I needed intrinsic evaluation measures to determine the optimal number of clusters. Clustering.jl offers only silhouettes in this regard, so I implemented the popular ones myself: the Calinski-Harabasz, Davies-Bouldin, and Xie-Beni indices. Would it be useful if I added them to Clustering.jl?

The formulas are easy, so it would not require a lot of work; I would just like to be sure first that there is interest in that.
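
To illustrate how short these definitions are, here is a minimal sketch of the Calinski-Harabasz and Davies-Bouldin indices for a hard clustering. It is written in plain Python rather than Julia, it is not Clustering.jl's API or the code discussed in this thread, and the function names (`calinski_harabasz`, `davies_bouldin`) and helpers are hypothetical. It assumes Euclidean distance, at least two clusters, and non-zero within-cluster scatter.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def group_by_label(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion; higher is better.

    Assumes 2 <= k < n and that the within-cluster dispersion is non-zero.
    """
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(c) * dist(centroid(c), overall) ** 2
                  for c in clusters.values())
    within = 0.0
    for c in clusters.values():
        m = centroid(c)
        within += sum(dist(p, m) ** 2 for p in c)
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(points, labels):
    """Mean over clusters of the worst scatter-to-separation ratio; lower is better.

    Assumes at least two clusters with distinct centroids.
    """
    clusters = list(group_by_label(points, labels).values())
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance of a cluster's points to its own centroid.
    scatter = [sum(dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k
```

To choose the number of clusters, one would compute either index for each candidate clustering and prefer the one with the highest Calinski-Harabasz value or the lowest Davies-Bouldin value.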

@nomadbl
Contributor

nomadbl commented Jun 23, 2023

Not a maintainer here or anything, but sounds interesting to me 😄

@alyst alyst added the feature label Jun 23, 2023
@alyst
Member

alyst commented Jun 23, 2023

Sure, this sounds very useful! One constraint, though, is that we don't want Clustering.jl to be a super-package for anything related to unsupervised learning, as that would complicate maintenance and unnecessarily bloat user projects.
So new features that would require extra dependencies, esp. outside of the github.com/JuliaXXX ecosystem, or features that add a lot of new, relatively complex code, should rather be implemented as separate packages.

If these new metrics fit the above criteria, the PR is welcome! :)

@jaksle
Contributor Author

jaksle commented Jun 24, 2023

No new dependencies are needed and there is no complex code; these quality indices have simpler definitions than silhouettes. Bloat is not my goal, but I hope the few most popular indices for hard and soft clustering will come in handy.

So great. I am busy right now, but next month I will try to check and adjust my code :)
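
For the soft-clustering case mentioned above, the Xie-Beni index can be sketched just as briefly. This is again plain Python with a hypothetical function name (`xie_beni`), not Clustering.jl's API. It assumes Euclidean distance, a membership matrix with one row per point and one column per cluster, at least two distinct centers, and the common fuzziness exponent of 2 as the default.

```python
from math import dist  # Euclidean distance, Python 3.8+


def xie_beni(points, centers, memberships, fuzziness=2.0):
    """Fuzzy compactness divided by n times the minimum squared center separation.

    Lower values indicate compact, well-separated clusters.
    memberships[i][k] is the degree to which point i belongs to cluster k.
    """
    n, k = len(points), len(centers)
    # Membership-weighted sum of squared distances to the cluster centers.
    compactness = sum(
        memberships[i][c] ** fuzziness * dist(points[i], centers[c]) ** 2
        for i in range(n)
        for c in range(k)
    )
    # Smallest squared distance between any two distinct centers.
    separation = min(
        dist(centers[a], centers[b]) ** 2
        for a in range(k)
        for b in range(a + 1, k)
    )
    return compactness / (n * separation)
```

With 0/1 memberships this reduces to a hard-clustering index, so the same function covers both cases.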

@alyst
Member

alyst commented Jan 23, 2024

Implemented in #257

@alyst alyst closed this as completed Jan 23, 2024