
Non-positive definite matrix when fitting nested GLMM #792

Open · dhalpern opened this issue Nov 12, 2024 · 10 comments

@dhalpern commented Nov 12, 2024

I am trying to understand some issues I've encountered when fitting models for a psychology experiment on memory, where there is a lot of nested structure.
For background, the goal is to estimate the effect of position within a list on the probability of recall. Each subject in the experiment participates in multiple sessions with multiple lists. The following code creates an example dataset:

using MixedModels, DataFrames, Random
Random.seed!(0);
df = allcombinations(DataFrame, "subject" => 1:100, 
    "session" => 1:6, "list" => 1:25, "serialpos" => 1:12)
df[!, :recalled] = rand(nrow(df))
df[!, :recalled] = df[!, :recalled] .> .5

Then to fit the model I use the following code:

serialpos_contrasts = Dict(
    :serialpos => DummyCoding()
)

m = fit(MixedModel,
    form,
    df, Bernoulli();
    contrasts=serialpos_contrasts,
    progress=false);

I fit several models, most of which fit without problems and one that fails. The following all fit fine:
(m1) recalled ~ serialpos + (1 | subject) + (1 | subject & session) + (1 | subject & session & list)
(m2) recalled ~ serialpos + zerocorr(serialpos | subject)
(m3) recalled ~ serialpos + (1 | subject) + (1 | serialpos & subject) + (1 | subject & session)

However, I run into issues with the following model, which has serialpos effects varying by subject plus random intercepts for session within subject:
(m4) recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session)
This results in the following error:

ERROR: LoadError: PosDefException: matrix is not positive definite; Factorization failed.

I realize this has been partially addressed before (e.g. #349 (comment)), but I'm not sure I fully understand why (m4) errors: (m1) should have more parameters, and (m3) should be nearly equivalent to (m4) (although I realize the design matrices are slightly different).
I thought perhaps the correlation structure might be an issue, so I tried fitting two models with fulldummy, which seems more straightforward.

(m5) recalled ~ serialpos + zerocorr(fulldummy(serialpos) | subject)
(m6) recalled ~ serialpos + zerocorr(fulldummy(serialpos) | subject) + (1 | subject & session)

(m5) fit fine, while (m6) ran into the same positive-definiteness error.

Is there any good intuition for what is going on here?

@kliegl commented Nov 12, 2024

What happens if you include some clustering in the simulation? Or am I missing this in your code?

@dhalpern (Author)

No, there is no clustering in the current simulation (recalled is just a Bernoulli RV for every row of the dataframe). Just to clarify: you're suggesting a simulation with non-zero variability due to subject, session, etc.?

@kliegl commented Nov 12, 2024

Right. If I understand correctly, your data have nothing to be recovered in the random-effects structure, so they do not require a GLMM; they would be fitted with a GLM.
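
For illustration, a fixed-effects-only fit with GLM.jl could look like the following (a minimal sketch, not part of the original comment; it assumes GLM.jl's standard formula interface and contrasts keyword):

using DataFrames, Distributions, GLM, StatsModels

# Bernoulli GLM with only the fixed effect of serial position;
# no random-effects structure is estimated.
glm_fit = glm(@formula(recalled ~ serialpos), df, Bernoulli(), LogitLink();
              contrasts=Dict(:serialpos => DummyCoding()))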

@dhalpern (Author)

That is correct -- it still seems strange to me that this error occurs only for very specific design matrices (given that there is no random-effects structure in the simulated data for any of the models), but I will try a simulation with clustering and report back!

@dhalpern (Author) commented Nov 12, 2024

I added the following code to the data generation prior to fitting the model:

sim_m = GeneralizedLinearMixedModel(form, df, Bernoulli(); contrasts=serialpos_contrasts)
df[!, :recalled] = simulate(MersenneTwister(42), sim_m)

My understanding is that the random-effect variances are always initialized to 1 and the fixed effects are initialized to small values near 0. In any case, I believe this should simulate data under the assumption that the model being fitted is correct.
However, when using this code, the model I called (m4) above (recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session)) still resulted in the same non-positive definite error.
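
If one wanted to control the simulation parameters explicitly rather than rely on those defaults, simulate also takes β and θ keyword arguments (a minimal sketch; the values below are hypothetical, and it assumes simulate forwards these keywords the way simulate! does):

# Hypothetical parameter values; lengths are taken from the model object
# so they match the fixed-effects and covariance parameterizations.
β = randn(MersenneTwister(1), length(sim_m.β)) .* 0.5
θ = fill(0.5, length(sim_m.θ))
df[!, :recalled] = simulate(MersenneTwister(42), sim_m; β, θ)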

@dhalpern (Author)

I should note that this error actually occurs during construction of the model object, prior to any model fitting. The following code is enough to produce the error:

using DataFrames, Random, MixedModels
Random.seed!(0);
df = allcombinations(DataFrame, "subject" => 1:25, 
    "session" => 1:6, "list" => 1:15, "serialpos" => 1:12)
df[!, :recalled] = rand(nrow(df))
df[!, :recalled] = df[!, :recalled] .> .5;
serialpos_contrasts = Dict(
    :serialpos => DummyCoding()
)
form = @formula(recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session))
m = GeneralizedLinearMixedModel(form, df, Bernoulli(); contrasts=serialpos_contrasts)

Therefore, it cannot be about the relationship between the response variable and the covariates.

@kliegl commented Nov 12, 2024

The two formulae are acceptable for linear mixed models. I am out of my depth with GLMMs.

lmm = fit(MixedModel, form, df; contrasts=serialpos_contrasts)

@palday (Member) commented Nov 12, 2024

> Therefore, it cannot be about the relationship between the response variable and the covariates.

This isn't quite right -- the failing factorization also occurs during model creation because the factorization is part of the internal representation of a GeneralizedLinearMixedModel. The initial factorization uses the initial parameter values, which may not be constraining enough to yield reasonable estimates. More precisely, for a dataset where the ground truth has no variation between levels of the grouping variables (as in the examples up top), the default initial parameter values may be so bad that they are fundamentally incompatible with the data, and you get the error seen here.
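
The initial values in question can be inspected on the LMM side (a hedged aside, not from the original comment; it uses the public optsum field of a LinearMixedModel):

# Construct the LMM analogue and look at its default starting point:
# θ is initialized to 1.0 for each diagonal relative-covariance parameter.
lmm = LinearMixedModel(form, df; contrasts=serialpos_contrasts)
lmm.optsum.initial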

Since @kliegl pointed out that the LMM fit seems to work, that highlights an experimental feature we have that starts with an LMM fit to get better initial values for the GLMM fit. You can try passing the keyword argument init_from_lmm=[:β, :θ] to use the LMM as a starting point.
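
Concretely, that call might look like this (a sketch based on the earlier fit calls; only the init_from_lmm keyword is new):

# Experimental: initialize the GLMM's β and θ from a preliminary LMM fit.
m = fit(MixedModel, form, df, Bernoulli();
    contrasts=serialpos_contrasts,
    init_from_lmm=[:β, :θ],
    progress=false)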

@dhalpern (Author)

Thanks so much for all your help in understanding this so far, @palday and @kliegl. @palday, thanks for the suggestion, but (please correct me if I'm wrong) I'm not sure it will solve the problem, because the error occurs prior to fitting, in the constructor for GeneralizedLinearMixedModel, while init_from_lmm is a parameter for fit!. To clarify things, here is the stack trace:

Stacktrace:
  [1] cholUnblocked!(A::Matrix{Float64}, ::Type{Val{:L}})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/linalg/cholUnblocked.jl:34
  [2] updateL!(m::LinearMixedModel{Float64})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/linearmixedmodel.jl:1323
  [3] reweight!(m::LinearMixedModel{Float64}, weights::Vector{Float64})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/linearmixedmodel.jl:1033
  [4] deviance!(m::GeneralizedLinearMixedModel{Float64, Bernoulli{Float64}}, nAGQ::Int64)
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:144
  [5] GeneralizedLinearMixedModel(f::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.FunctionTerm{typeof(zerocorr), Vector{StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, tbl::@NamedTuple{subject::Vector{Int64}, session::Vector{Int64}, list::Vector{Int64}, serialpos::Vector{Int64}, recalled::BitVector}, d::Bernoulli{Float64}, l::LogitLink; wts::Vector{Any}, offset::Vector{Any}, contrasts::Dict{Symbol, DummyCoding}, amalgamate::Bool)
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:472
  [6] GeneralizedLinearMixedModel
    @ ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:386 [inlined]
  [7] fit(::Type{GeneralizedLinearMixedModel}, f::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.FunctionTerm{typeof(zerocorr), Vector{StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, tbl::@NamedTuple{subject::Vector{Int64}, session::Vector{Int64}, list::Vector{Int64}, serialpos::Vector{Int64}, recalled::BitVector}, d::Bernoulli{Float64}, l::LogitLink; wts::Vector{Any}, contrasts::Dict{Symbol, DummyCoding}, offset::Vector{Any}, amalgamate::Bool, kwargs::@Kwargs{init_from_lmm::Vector{Symbol}, progress::Bool})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:187
  [8] fit(::Type{GeneralizedLinearMixedModel}, f::StatsModels.FormulaTerm{StatsModels.Term, Tuple{StatsModels.Term, StatsModels.FunctionTerm{typeof(zerocorr), Vector{StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}, tbl::DataFrame, d::Bernoulli{Float64}, l::LogitLink; kwargs::@Kwargs{contrasts::Dict{Symbol, DummyCoding}, init_from_lmm::Vector{Symbol}, progress::Bool})
    @ MixedModels ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:172
  [9] #fit#100
    @ ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:201 [inlined]
 [10] fit
    @ ~/.julia/packages/MixedModels/sKseQ/src/generalizedlinearmixedmodel.jl:193 [inlined]

Specifically, the issue seems to be with the deviance calculation here.
At this point I'm wondering what exactly this deviance calculation does. The following code (essentially adapted from generalizedlinearmixedmodel.jl) runs and constructs a GLMM object without issue.

using DataFrames, MixedModels, Random, GLM, Tables

Random.seed!(0);
df = allcombinations(DataFrame, "subject" => 1:25, 
    "session" => 1:6, "list" => 1:15, "serialpos" => 1:12)
df[!, :recalled] = rand(nrow(df))
df[!, :recalled] = df[!, :recalled] .> .5

contrasts = Dict(
    :serialpos => DummyCoding()
)
f = @formula(recalled ~ serialpos + zerocorr(serialpos | subject) + (1 | subject & session))
wts = []
d = Bernoulli()
l = canonicallink(d)
tbl = Tables.columntable(df)
amalgamate = true
offset = []

LMM = LinearMixedModel(f, tbl; contrasts, wts, amalgamate)
y = copy(LMM.y)
constresponse = all(==(first(y)), y)
# the sqrtwts field must be the correct length and type but we don't know those
# until after the model is constructed if wt is empty.  Because a LinearMixedModel
# type is immutable, another one must be created.
if isempty(wts)
    LMM = LinearMixedModel(
        LMM.formula,
        LMM.reterms,
        LMM.Xymat,
        LMM.feterm,
        fill!(similar(y), 1),
        LMM.parmap,
        LMM.dims,
        LMM.A,
        LMM.L,
        LMM.optsum,
    )
end
X = MixedModels.fullrankx(LMM.feterm)
# if the response is constant, there's no point (and this may even fail)
# we allow this instead of simply failing so that a constant response can
# be used as the starting point to simulation where the response will be
# overwritten before fitting
constresponse || updateL!(LMM)
# fit a glm to the fixed-effects only
T = eltype(LMM.Xymat)
# newer versions of GLM (>1.8.0) have a kwarg dropcollinear=true
# which creates problems for the empty fixed-effects case during fitting
# so just don't allow fitting
# XXX unfortunately, this means we have double-rank deficiency detection
# TODO: construct GLM by hand so that we skip collinearity checks
# TODO: extend this so that we never fit a GLM when initializing from LMM
dofit = size(X, 2) != 0 # GLM.jl kwarg
gl = glm(X, y, d, l;
    wts=convert(Vector{T}, wts),
    dofit,
    offset=convert(Vector{T}, offset))
β = dofit ? coef(gl) : T[]
u = [fill(zero(eltype(y)), MixedModels.vsize(t), MixedModels.nlevs(t)) for t in LMM.reterms]
# vv is a template vector used to initialize fields for AGQ
# it is empty unless there is a single random-effects term
vv = length(u) == 1 ? vec(first(u)) : similar(y, 0)

res = GeneralizedLinearMixedModel{T,typeof(d)}(
    LMM,
    β,
    copy(β),
    LMM.θ,
    copy.(u),
    u,
    zero.(u),
    gl.rr,
    similar(y),
    oftype(y, wts),
    similar(vv),
    similar(vv),
    similar(vv),
    similar(vv),
)

Would there be an issue with calling fit! on the res object at this point? Or is the deviance call necessary?

@dhalpern (Author)

In the other direction, I've now attempted to produce an example where the data are generated with clustering and the model is "correct":

using DataFrames, Random, Distributions, LogExpFunctions, MixedModels
Random.seed!(0);

n_subjects = 100
n_sessions = 6
n_lists = 25
n_serialpos = 12
df = allcombinations(DataFrame, "subject" => 1:n_subjects, 
    "session" => 1:n_sessions, "list" => 1:n_lists, "serialpos" => 1:n_serialpos)

serialpos_fe_df = DataFrame("serialpos" => 1:n_serialpos,
    "serialpos_fe" => randn(n_serialpos))
sub_re_df = DataFrame("subject" => 1:n_subjects, "sub_re" => randn(n_subjects))
sub_session_re_df = allcombinations(DataFrame, "subject" => 1:n_subjects, 
        "session" => 1:n_sessions)
sub_session_re_df[!, :sub_session_re] = randn(n_subjects * n_sessions)
sub_serialpos_re_df = allcombinations(DataFrame, "subject" => 1:n_subjects, 
        "serialpos" => 1:n_serialpos)
sub_serialpos_re_df[!, :sub_serialpos_re] = randn(n_subjects * n_serialpos)

df = innerjoin(df, serialpos_fe_df, on = :serialpos)
df = innerjoin(df, sub_re_df, on = :subject)
df = innerjoin(df, sub_session_re_df, on = [:subject, :session])
df = innerjoin(df, sub_serialpos_re_df, on = [:subject, :serialpos])

df[!, :linpred] = df[!, :serialpos_fe] + df[!, :sub_re] + df[!, :sub_session_re] + df[!, :sub_serialpos_re]
df[!, :predprob] = logistic.(df[!, :linpred])
df[!, :recalled] = rand.(Bernoulli.(df[!, :predprob]))

serialpos_contrasts = Dict(
    :serialpos => DummyCoding()
)

form = @formula(recalled ~ serialpos + zerocorr(fulldummy(serialpos) | subject) + (1 | subject & session))
m = GeneralizedLinearMixedModel(form, df, Bernoulli(); contrasts=serialpos_contrasts)

This resulted in the same positive-definiteness error. That said, I'm still somewhat of a Julia novice, so please correct me if I've made any mistakes here!
