add sparse argument to step_dummy() #1392
it is set to default to … Example helper function below. Currently we don't have the infrastructure yet to determine if the sparse columns produced by …
library(recipes)
library(modeldata)
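# Set sparse = TRUE on every step in the recipe that has a sparse argument
recipes_all_sparse <- function(x) {
  for (i in seq_along(x$steps)) {
    # Steps without a sparse field are left untouched
    if (!is.null(x$steps[[i]]$sparse)) {
      x$steps[[i]]$sparse <- TRUE
    }
  }
  x
}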
rec_spec <- recipe(Sale_Price ~ ., data = ames) |>
  step_unknown(all_nominal_predictors()) |>
  step_impute_mean(all_numeric_predictors()) |>
  step_normalize(all_numeric_predictors()) |>
  step_dummy(all_nominal_predictors())
rec_spec |>
  prep() |>
  bake(NULL) |>
  lobstr::obj_size()
#> 7.46 MB
rec_spec |>
  recipes_all_sparse() |>
  prep() |>
  bake(NULL) |>
  lobstr::obj_size()
#> 1.71 MB
Created on 2024-11-12 with reprex v2.1.0
Super clean!
No objections for that helper living in recipes rather than workflows.
Good move on scoping this PR as just implementing step_dummy(sparse) and holding off on tackling the "smart" toggling of the argument in the backend. Your thoughts about what that interface might look like sound reasonable!
Co-authored-by: Simon P. Couch <[email protected]>
This PR adds the creation of sparse dummy variables in step_dummy() via the use of the sparse argument.