This recent change really threw a wrench into my pipeline: `MultivariateStats.jl/src/lda.jl`, line 522 (commit `bf15ed0`).

I am training LDAs on one set of trials and testing the decoding performance on a separate set of trials. All of a sudden my performance dropped to chance, and after about a day of digging around I realised that `toindices` actually mutates the label names. In other words, when I was decoding the test set by finding the projected mean that each sample was closest to, I was using the original labels for my testing, and so the class assignments were all essentially random.
As a stopgap measure for my pipeline, I defined

```julia
MultivariateStats.toindices(label::AbstractVector{T}) where T <: Integer = label
```
which fixed my issue, but I realise that this is not a general solution. In particular, if there are gaps in `label`, such that `maximum(label) !== length(unique(label))`, this could also cause problems.

Is there currently an array type that fulfils that criterion?
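To make the gap concern concrete, here is a minimal illustration (hypothetical labels and centroids, not my actual pipeline): with gappy integer labels, passing the labels straight through as indices into the per-class statistics goes out of bounds.

```julia
labels = [2, 2, 5, 5, 9, 9]              # three classes, but maximum(labels) == 9
k = length(unique(labels))               # k == 3
class_means = [randn(4) for _ in 1:k]    # one centroid per class, stored at indices 1..3

idx = labels                             # the identity "toindices" stopgap passes labels through
# class_means[idx[end]]                  # would throw BoundsError: index 9 into a length-3 vector
```

So the stopgap only works when the integer labels already happen to be exactly `1:k`.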
I see. That was a problem with the previous implementation: labels and indices were conflated, which caused bounds errors if the labels weren't properly defined, #187. It looks like a design problem, because the LDA model doesn't carry any explicit information about the labels. The class centroids relate to the index of a label rather than the label itself. You can use `toindices` to get a map from labels to indices, and use that map to look up the correct class centroid and weight data.
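A minimal sketch of what that could look like (assuming `toindices` returns one integer index per element of the label vector; `train_labels` and the projected-centroid decode are placeholders from your own pipeline):

```julia
using MultivariateStats

# Build an explicit label <-> index map from the same labels used for fitting,
# so test-time decoding can translate between original labels and class indices.
idx = MultivariateStats.toindices(train_labels)      # assumed: one Int index per label
label_to_index = Dict(zip(train_labels, idx))        # e.g. 5 => 1, 9 => 2, ...
index_to_label = Dict(v => k for (k, v) in label_to_index)

# After projecting a test sample z, decode by nearest projected class mean and
# translate the winning *index* back to the original label, e.g.:
# best = argmin(j -> norm(z .- projected_means[:, j]), 1:length(index_to_label))
# predicted_label = index_to_label[best]
```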