standardize() fails in some cases #441

Open
etiennebacher opened this issue Jul 1, 2023 · 7 comments · May be fixed by #442

@etiennebacher
Member

Cf easystats/report#379

library(datawizard)

data(sleep)
d <- data_modify(sleep, group = as.integer(group) - 1L)
d_wide <- data_to_wide(
  d,
  names_from = "group",
  values_from = "extra",
  names_prefix = "group"
)

model1 <- lm(d_wide$group0 - d_wide$group1 ~ 1)
standardize(model1)
#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =

#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =

#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =
#> 
#> Call:
#> lm(formula = d_wide$group0 - d_wide$group1 ~ 1, data = data_std)
#> 
#> Coefficients:
#> (Intercept)  
#>       -1.58

model1 <- lm(group0 - group1 ~ 1, data = d_wide)
standardize(model1)
#> Error in eval(predvars, data, env): object 'group0' not found
#> Error: Unable to refit the model with standardized data.
#>   Try instead to standardize the data (standardize(data)) and refit the
#>   model manually.

Created on 2023-07-01 with reprex v2.0.2
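
The error message suggests standardizing the data and refitting the model manually. A minimal sketch of that route, assuming a simple model on `iris` rather than the composite response above (the model and variable choice here are illustrative only):

```r
library(datawizard)

data(iris)

# Standardize the data frame first; by default only numeric columns are
# rescaled (mean 0, SD 1), factors such as Species are left untouched.
iris_std <- standardize(iris)

# Then refit the model by hand on the standardized data.
model_std <- lm(Sepal.Length ~ Petal.Length, data = iris_std)
coef(model_std)
```

With both response and predictor standardized, the intercept should be numerically zero. When the response is an expression of several columns, standardizing each column separately is not equivalent to standardizing the expression.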

@etiennebacher
Member Author

Looks like it comes from these lines:

data_std <- NULL # needed to avoid note
.standardize_models(x,
  robust = robust, two_sd = two_sd,
  weights = weights,
  verbose = verbose,
  include_response = include_response,
  update_expr = stats::update(x, data = data_std),
  ...
)

stats::update() fails in this case when data = NULL:

library(datawizard)

data(sleep)
d <- data_modify(sleep, group = as.integer(group) - 1L)
d_wide <- data_to_wide(
  d,
  names_from = "group",
  values_from = "extra",
  names_prefix = "group"
)

model1 <- lm(d_wide$group0 - d_wide$group1 ~ 1)
stats::update(model1, data = NULL)
#> 
#> Call:
#> lm(formula = d_wide$group0 - d_wide$group1 ~ 1, data = NULL)
#> 
#> Coefficients:
#> (Intercept)  
#>       -1.58

model2 <- lm(group0 - group1 ~ 1, data = d_wide)
stats::update(model2, data = NULL)
#> Error in eval(predvars, data, env): object 'Sepal.Length' not found

@strengejacke I think you wrote this code, do you know how to fix this?

@strengejacke
Member

Minimal reprex:

data(iris)
model <- lm(Sepal.Length ~ 1, data = iris)
stats::update(model, data = NULL)
#> Error in eval(predvars, data, env): object 'Sepal.Length' not found

Created on 2023-07-01 with reprex v2.0.2
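
To see why the refit fails, it can help to separate building the updated call from evaluating it. The sketch below uses the `evaluate = FALSE` argument of `stats::update()` and then evaluates the call inside `tryCatch()`; the object names `new_call` and `res` are introduced here for illustration only.

```r
data(iris)
model <- lm(Sepal.Length ~ 1, data = iris)

# Build the updated call without evaluating it.
new_call <- stats::update(model, data = NULL, evaluate = FALSE)

# Evaluating the call refits the model without a data frame, so
# Sepal.Length is looked up in the calling environment, where it
# does not exist. Capture the error message instead of stopping.
res <- tryCatch(
  eval(new_call),
  error = function(e) conditionMessage(e)
)
res
```

If `Sepal.Length` happened to exist as a standalone vector in the calling environment, the same call would refit without error, which is consistent with the `d_wide$group0` formula working above.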

@strengejacke strengejacke self-assigned this Jul 1, 2023
strengejacke added a commit that referenced this issue Jul 1, 2023
@strengejacke strengejacke linked a pull request Jul 1, 2023 that will close this issue
@strengejacke
Member

strengejacke commented Jul 1, 2023

The problem for your particular example is the returned data:

library(datawizard)

data(sleep)
d <- data_modify(sleep, group = as.integer(group) - 1L)
d_wide <- data_to_wide(
  d,
  names_from = "group",
  values_from = "extra",
  names_prefix = "group"
)

model2 <- lm(group0 - group1 ~ 1, data = d_wide)
head(insight::get_data(model2, source = "mf"))
#>   group0 - group1
#> 1            -1.2
#> 2            -2.4
#> 3            -1.3
#> 4            -1.3
#> 5             0.0
#> 6            -1.0

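The model frame only contains the computed column "group0 - group1", but to refit the model we need the original variables group0 and group1.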
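
As an illustration of the difference (not necessarily how the PR resolves it), base R can show both the column name stored in the model frame and the underlying variable names taken from the formula. The small data frame below is made up for the example:

```r
# Hypothetical toy data with the same column names as d_wide
d_toy <- data.frame(
  group0 = c(0.7, -1.6, -0.2, -1.2),
  group1 = c(1.9, 0.8, 1.1, 0.1)
)
model_toy <- lm(group0 - group1 ~ 1, data = d_toy)

# The model frame stores the evaluated expression as a single column
mf_names <- names(model.frame(model_toy))
mf_names

# The formula still knows which raw variables are involved
raw_vars <- all.vars(formula(model_toy))
raw_vars
```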

The PR at least works for these cases:

library(datawizard)

data(iris)
model <- lm(Sepal.Length ~ 1, data = iris)
standardize(model)
#> 
#> Call:
#> lm(formula = Sepal.Length ~ 1, data = data_std)
#> 
#> Coefficients:
#> (Intercept)  
#>  -4.079e-16

data(sleep)
d <- data_modify(sleep, group = as.integer(group) - 1L)
d_wide <- data_to_wide(
  d,
  names_from = "group",
  values_from = "extra",
  names_prefix = "group"
)

model1 <- lm(d_wide$group0 - d_wide$group1 ~ 1)
standardize(model1)
#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =

#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =

#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =

#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =

#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: group0 - group1 ~ 1,
#>   data =
#> 
#> Call:
#> lm(formula = d_wide$group0 - d_wide$group1 ~ 1, data = data_std)
#> 
#> Coefficients:
#> (Intercept)  
#>       -1.58

model2 <- lm(group0 - group1 ~ 1, data = d_wide)
standardize(model2)
#> 
#> Call:
#> lm(formula = group0 - group1 ~ 1, data = data_std)
#> 
#> Coefficients:
#> (Intercept)  
#>   5.266e-17

Created on 2023-07-01 with reprex v2.0.2

@mattansb
Member

mattansb commented Jul 2, 2023

Note that if you have a composite variable (as in this example), and you standardize each part separately, you can easily get nonsense results...

$$ Z(x) - Z(z) \neq Z(x - z) $$
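
A small numeric check of this inequality, using made-up vectors and a hand-written `z_score()` helper (both are assumptions for illustration, not part of datawizard):

```r
# Plain z-scoring: center on the mean, scale by the standard deviation
z_score <- function(v) (v - mean(v)) / sd(v)

x <- c(1, 2, 3, 4, 5)
z <- c(2, 1, 4, 3, 7)

diff_of_z <- z_score(x) - z_score(z)  # Z(x) - Z(z)
z_of_diff <- z_score(x - z)           # Z(x - z)

# The two vectors differ
same <- isTRUE(all.equal(diff_of_z, z_of_diff))
same

# Only Z(x - z) is guaranteed to have SD 1
sd(z_of_diff)
sd(diff_of_z)
```

The difference of two standardized variables has variance 2 - 2 * cor(x, z), so it is only on a unit scale when the correlation happens to be 0.5.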

@LAhmos

LAhmos commented Jul 22, 2024

Is there any workaround for this?

@mattansb
Member

You would probably want to compute the composite variable beforehand (not in the formula):

library(datawizard)

data(sleep)

d_wide <- sleep |> 
  data_modify(group = as.integer(group) - 1L) |> 
  data_to_wide(
    names_from = "group",
    values_from = "extra",
    names_prefix = "group"
  ) |> 
  data_modify(
    diff = group0 - group1
  )

model1 <- lm(diff ~ 1, data = d_wide)
standardize(model1)
#> 
#> Call:
#> lm(formula = diff ~ 1, data = data_std)
#> 
#> Coefficients:
#> (Intercept)  
#>   1.755e-17

@drpradeepharish

drpradeepharish commented Nov 20, 2024

@strengejacke

Does the error still persist?

library(tidyverse)    
library(easystats)

    
    Data <- structure(list(`12_month_remission` = c(0, 1, 1, 0, 0, 0, 0, 
    0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 
    1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 
    1), Sex = structure(c(2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 
    1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 
    1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 
    2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), levels = c("Female", "Male"), class = "factor")), row.names = c(NA, 
    -50L), class = c("tbl_df", "tbl", "data.frame"))

glm(`12_month_remission` ~ Sex, family = "binomial", data = Data) %>% 
  report::report() 

Error in eval(predvars, data, env) : object '12_month_remission' not found
Error: Unable to refit the model with standardized data.
Try instead to standardize the data (standardize(data)) and refit the model manually.
