-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC Added SkipMissing wrapper metric #261
base: master
Are you sure you want to change the base?
Conversation
""" | ||
Exclude any missing indices from being included the wrappped distance metric. | ||
""" | ||
struct SkipMissing{D<:PreMetric} <: PreMetric |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there's a benefit to have specific subtypes here? All the metrics I tested seemed to work.
# what the parameters mean. | ||
# NTOE: We call disallowmissings to avoid downstream type promotion issues. | ||
if dist.d isa UnionMetrics | ||
params = parameters(dist.d) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT, parameters
is only defined for UnionMetrics
?
isnothing(params) ? params : view(params, mask), | ||
) | ||
else | ||
return dist.d( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there's a safer fallback?
@test_throws MethodError pairwise(dist, A, dims=2) | ||
end | ||
|
||
# Handle weights |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're just testing that skipmissing
returns the same results as deleting indices missing in either vector.
@test D(a, b) == dist([1, 5], [6, 10]) | ||
end | ||
|
||
@test colwise(D, A, B)[3:4] ≈ colwise(dist, disallowmissing(A[:, 3:4]), disallowmissing(B[:, 3:4])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double check results are the same for non-missing columns.
|
||
M = pairwise(D, A, dims=2) | ||
@test size(M) == (4, 4) | ||
@test !any(ismissing, M) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could manually check the calculations here, but I figured a few smoke tests that it doesn't error or return missing
s was good enough?
Hmm, looks like some of the metrics throw different errors on different Julia versions. |
A potential solution to #11
Pros:
skipmissing
already exists in base, so hopefully this approach is relatively intuitiveCons:
Base.skipmissing
that we shouldn't export and could cause confusion?ntotal / ncomplete
factor inside thesqrt
for euclidean distancesAlternatives: