
Non-positive definite matrix when fitting nested GLMM #792

Open · dhalpern opened this issue Nov 12, 2024 · 10 comments

@dhalpern commented Nov 12, 2024

I am trying to understand some issues I've encountered when fitting models for a psychology experiment on memory, where there is a lot of nested structure.
For background, the goal is to estimate the effect of position within a list on the probability of recall. Each subject in the experiment participates in multiple sessions with multiple lists. The following code creates an example dataset:

using MixedModels, DataFrames, Random
Random.seed!(0);
df = allcombinations(DataFrame, "subject" => 1:100, 
    "session" => 1:6, "list" => 1:25, "serialpos" => 1:12)
df[!, :recalled] = rand(nrow(df))
df[!, :recalled] = df[!, :recalled] .> .5

Then to fit the model I use the following code:

serialpos_contrasts = Dict(
    :serialpos => DummyCoding()
)

m = fit(MixedModel,
    form,
    df, Bernoulli();
    contrasts=serialpos_contrasts,
    progress=false);

I fit several models, most of which fit without problems and one that fails. The following all fit fine:
(m1) recalled ~ serialpos + (1 | subject) + (1 | subject & session) + (1 | subject & session & list)
(m2) recalled ~ serialpos + zerocorr(serialpos | subject)
(m3) recalled ~ serialpos + (1 | subject) + (1 | serialpos & subject) + (1 | subject & session)

However, I run into issues with the following model, which has serialpos effects varying by subject plus random intercepts for session within subject:
(m4) recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session)
This results in the following error:

ERROR: LoadError: PosDefException: matrix is not positive definite; Factorization failed.

I realize this has been partially addressed before (e.g. #349 (comment)), but I'm not sure I fully understand why (m4) errors: (m1) should have more parameters, and (m3) should be nearly equivalent to (m4) (although I realize the design matrices are slightly different).
I thought perhaps the correlation structure might be an issue, so I tried fitting two models with fulldummy, which seems more straightforward.

(m5) recalled ~ serialpos + zerocorr(fulldummy(serialpos) | subject)
(m6) recalled ~ serialpos + zerocorr(fulldummy(serialpos) | subject) + (1 | subject & session)

(m5) fit fine, while (m6) ran into the same positive-definiteness error.

Is there any good intuition for what is going on here?

@kliegl commented Nov 12, 2024

What happens if you include some clustering in the simulation? Or am I missing this in your code?

@dhalpern (Author)

No, there is no clustering in the current simulation (recalled is just a Bernoulli RV for every row of the dataframe). Just to clarify: you're suggesting a simulation with non-zero variability due to subject, session, etc.?

@kliegl commented Nov 12, 2024

Right. If I understand correctly, your data have nothing to be recovered in the random-effects structure, so they do not require a GLMM; they would be fitted with a GLM.
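
For illustration, a fixed-effects-only fit with GLM.jl could look like the following (a minimal sketch, not part of the original comment; it assumes GLM.jl's standard formula interface and contrasts keyword):

using DataFrames, Distributions, GLM, StatsModels

# Bernoulli GLM with only the fixed effect of serial position;
# no random-effects structure is estimated.
glm_fit = glm(@formula(recalled ~ serialpos), df, Bernoulli(), LogitLink();
              contrasts=Dict(:serialpos => DummyCoding()))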

@dhalpern (Author)

That is correct -- it still seems strange to me that this error occurs only for very specific design matrices (given that there is no random-effects structure in the simulated data for any of the models), but I will try a simulation with clustering and report back!

@dhalpern (Author) commented Nov 12, 2024

I added the following code to the data generation prior to fitting the model:

sim_m = GeneralizedLinearMixedModel(form, df, Bernoulli(); contrasts=serialpos_contrasts)
df[!, :recalled] = simulate(MersenneTwister(42), sim_m)

My understanding is that the random-effect variances are always initialized to 1 and the fixed effects are initialized to small values near 0. In any case, I believe this should simulate data under the assumption that the model being fitted is correct.
However, when using this code, the model I called (m4) above (recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session)) still resulted in the same non-positive definite error.
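
If one wanted to control the simulation parameters explicitly rather than rely on those defaults, simulate also takes β and θ keyword arguments (a minimal sketch; the values below are hypothetical, and it assumes simulate forwards these keywords the way simulate! does):

# Hypothetical parameter values; lengths are taken from the model object
# so they match the fixed-effects and covariance parameterizations.
β = randn(MersenneTwister(1), length(sim_m.β)) .* 0.5
θ = fill(0.5, length(sim_m.θ))
df[!, :recalled] = simulate(MersenneTwister(42), sim_m; β, θ)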

@dhalpern (Author)

I should note that this error actually occurs during construction of the model object, prior to any model fitting. The following code is enough to produce the error:

using DataFrames, Random, MixedModels
Random.seed!(0);
df = allcombinations(DataFrame, "subject" => 1:25, 
    "session" => 1:6, "list" => 1:15, "serialpos" => 1:12)
df[!, :recalled] = rand(nrow(df))
df[!, :recalled] = df[!, :recalled] .> .5;
serialpos_contrasts = Dict(
    :serialpos => DummyCoding()
)
form = @formula(recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session))
m = GeneralizedLinearMixedModel(form, df, Bernoulli(); contrasts=serialpos_contrasts)

Therefore, it cannot be about the relationship between the response variable and the covariates.

@kliegl commented Nov 12, 2024

The two formulae are acceptable for linear mixed models. I am out of my depth with GLMMs.

lmm = fit(MixedModel, form, df; contrasts=serialpos_contrasts)

@palday (Member) commented Nov 12, 2024

> Therefore, it cannot be about the relationship between the response variable and the covariates.

This isn't quite right -- the failing factorization also occurs during model creation because the factorization is part of the internal representation of a GeneralizedLinearMixedModel. The initial factorization uses the initial parameter values, which may not be constraining enough to yield reasonable estimates. More precisely, for a dataset where the ground truth has no variation between levels of the grouping variables (as in the examples up top), the default initial parameter values may be so bad that they are fundamentally incompatible with the data, and you get the error seen here.
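
The initial values in question can be inspected on the LMM side (a hedged aside, not from the original comment; it uses the public optsum field of a LinearMixedModel):

# Construct the LMM analogue and look at its default starting point:
# θ is initialized to 1.0 for each diagonal relative-covariance parameter.
lmm = LinearMixedModel(form, df; contrasts=serialpos_contrasts)
lmm.optsum.initial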

Since @kliegl pointed out that the LMM fit seems to work, that highlights an experimental feature we have that starts with an LMM fit to get better initial values for the GLMM fit. You can try passing the keyword argument init_from_lmm=[:β, :θ] to use the LMM as a starting point.
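
Concretely, that call might look like this (a sketch based on the earlier fit calls; only the init_from_lmm keyword is new):

# Experimental: initialize the GLMM's β and θ from a preliminary LMM fit.
m = fit(MixedModel, form, df, Bernoulli();
    contrasts=serialpos_contrasts,
    init_from_lmm=[:β, :θ],
    progress=false)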

@dhalpern (Author)

Thanks so much for all your help in understanding this so far, @palday and @kliegl. @palday, thanks for the suggestion, but (please correct me if I'm wrong) I'm not sure it will solve the problem, because the error occurs prior to fitting, in the constructor for GeneralizedLinearMixedModel, while init_from_lmm is a parameter for fit!. To clarify things, here is the stack trace:

Stacktrace:
  [1] cholUnblocked!(A::Matrix{Float64}, ::Type{Val{:L}})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/linalg/cholUnblocked.jl:34
  [2] updateL!(m::LinearMixedModel{Float64})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/linearmixedmodel.jl:1323
  [3] reweight!(m::LinearMixedModel{Float64}, weights::Vector{Float64})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/linearmixedmodel.jl:1033
  [4] deviance!(m::GeneralizedLinearMixedModel{Float64, Bernoulli{Float64}}, nAGQ::Int64)
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:144
  [5] GeneralizedLinearMixedModel(f::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.FunctionTerm{typeof(zerocorr), Vector{StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, tbl::@NamedTuple{subject::Vector{Int64}, session::Vector{Int64}, list::Vector{Int64}, serialpos::Vector{Int64}, recalled::BitVector}, d::Bernoulli{Float64}, l::LogitLink; wts::Vector{Any}, offset::Vector{Any}, contrasts::Dict{Symbol, DummyCoding}, amalgamate::Bool)
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:472
  [6] GeneralizedLinearMixedModel
    @ ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:386 [inlined]
  [7] fit(::Type{GeneralizedLinearMixedModel}, f::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.FunctionTerm{typeof(zerocorr), Vector{StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, tbl::@NamedTuple{subject::Vector{Int64}, session::Vector{Int64}, list::Vector{Int64}, serialpos::Vector{Int64}, recalled::BitVector}, d::Bernoulli{Float64}, l::LogitLink; wts::Vector{Any}, contrasts::Dict{Symbol, DummyCoding}, offset::Vector{Any}, amalgamate::Bool, kwargs::@Kwargs{init_from_lmm::Vector{Symbol}, progress::Bool})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:187
  [8] fit(::Type{GeneralizedLinearMixedModel}, f::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.FunctionTerm{typeof(zerocorr), Vector{StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, tbl::DataFrame, d::Bernoulli{Float64}, l::LogitLink; kwargs::@Kwargs{contrasts::Dict{Symbol, DummyCoding}, init_from_lmm::Vector{Symbol}, progress::Bool})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:172
  [9] #fit#100
    @ ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:201 [inlined]
 [10] fit
    @ ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:193 [inlined]

Specifically, the issue seems to be with the deviance calculation here.
At this point I'm wondering what exactly this deviance calculation does. The following code (essentially adapted from generalizedlinearmixedmodel.jl) runs and constructs a GLMM object without issue.

using DataFrames, MixedModels, Random, GLM, Tables

Random.seed!(0);
df = allcombinations(DataFrame, "subject" => 1:25, 
    "session" => 1:6, "list" => 1:15, "serialpos" => 1:12)
df[!, :recalled] = rand(nrow(df))
df[!, :recalled] = df[!, :recalled] .> .5

contrasts = Dict(
    :serialpos => DummyCoding()
)
f = @formula(recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session))
wts = []
d = Bernoulli()
l = canonicallink(d)
tbl = Tables.columntable(df)
amalgamate = true
offset = []

LMM = LinearMixedModel(f, tbl; contrasts, wts, amalgamate)
y = copy(LMM.y)
constresponse = all(==(first(y)), y)
# the sqrtwts field must be the correct length and type but we don't know those
# until after the model is constructed if wt is empty.  Because a LinearMixedModel
# type is immutable, another one must be created.
if isempty(wts)
    LMM = LinearMixedModel(
        LMM.formula,
        LMM.reterms,
        LMM.Xymat,
        LMM.feterm,
        fill!(similar(y), 1),
        LMM.parmap,
        LMM.dims,
        LMM.A,
        LMM.L,
        LMM.optsum,
    )
end
X = MixedModels.fullrankx(LMM.feterm)
# if the response is constant, there's no point (and this may even fail)
# we allow this instead of simply failing so that a constant response can
# be used as the starting point to simulation where the response will be
# overwritten before fitting
constresponse || updateL!(LMM)
# fit a glm to the fixed-effects only
T = eltype(LMM.Xymat)
# newer versions of GLM (>1.8.0) have a kwarg dropcollinear=true
# which creates problems for the empty fixed-effects case during fitting
# so just don't allow fitting
# XXX unfortunately, this means we have double-rank deficiency detection
# TODO: construct GLM by hand so that we skip collinearity checks
# TODO: extend this so that we never fit a GLM when initializing from LMM
dofit = size(X, 2) != 0 # GLM.jl kwarg
gl = glm(X, y, d, l;
    wts=convert(Vector{T}, wts),
    dofit,
    offset=convert(Vector{T}, offset))
β = dofit ? coef(gl) : T[]
u = [fill(zero(eltype(y)), MixedModels.vsize(t), MixedModels.nlevs(t)) for t in LMM.reterms]
# vv is a template vector used to initialize fields for AGQ
# it is empty unless there is a single random-effects term
vv = length(u) == 1 ? vec(first(u)) : similar(y, 0)

res = GeneralizedLinearMixedModel{T,typeof(d)}(
    LMM,
    β,
    copy(β),
    LMM.θ,
    copy.(u),
    u,
    zero.(u),
    gl.rr,
    similar(y),
    oftype(y, wts),
    similar(vv),
    similar(vv),
    similar(vv),
    similar(vv),
)

Would there be an issue with calling fit! on the res object at this point? Or is the deviance call necessary?

@dhalpern (Author)

In the other direction, I've now attempted to produce an example where the data are generated with clustering and the model is "correct":

using DataFrames, Random, Distributions, LogExpFunctions, MixedModels
Random.seed!(0);

n_subjects = 100
n_sessions = 6
n_lists = 25
n_serialpos = 12
df = allcombinations(DataFrame, "subject" => 1:n_subjects, 
    "session" => 1:n_sessions, "list" => 1:n_lists, "serialpos" => 1:n_serialpos)

serialpos_fe_df = DataFrame("serialpos" => 1:n_serialpos,
    "serialpos_fe" => randn(n_serialpos))
sub_re_df = DataFrame("subject" => 1:n_subjects, "sub_re" => randn(n_subjects))
sub_session_re_df = allcombinations(DataFrame, "subject" => 1:n_subjects, 
        "session" => 1:n_sessions)
sub_session_re_df[!, :sub_session_re] = randn(n_subjects * n_sessions)
sub_serialpos_re_df = allcombinations(DataFrame, "subject" => 1:n_subjects, 
        "serialpos" => 1:n_serialpos)
sub_serialpos_re_df[!, :sub_serialpos_re] = randn(n_subjects * n_serialpos)

df = innerjoin(df, serialpos_fe_df, on = :serialpos)
df = innerjoin(df, sub_re_df, on = :subject)
df = innerjoin(df, sub_session_re_df, on = [:subject, :session])
df = innerjoin(df, sub_serialpos_re_df, on = [:subject, :serialpos])

df[!, :linpred] = df[!, :serialpos_fe] + df[!, :sub_re] + df[!, :sub_session_re] + df[!, :sub_serialpos_re]
df[!, :predprob] = logistic.(df[!, :linpred])
df[!, :recalled] = rand.(Bernoulli.(df[!, :predprob]))

serialpos_contrasts = Dict(
    :serialpos => DummyCoding()
)

form = @formula(recalled ~ serialpos + zerocorr(fulldummy(serialpos) | subject) + (1 | subject & session))
m = GeneralizedLinearMixedModel(form, df, Bernoulli(); contrasts=serialpos_contrasts)

This resulted in the same positive-definiteness error. That said, I'm still somewhat of a Julia novice, so please correct me if I've made any mistakes here!
