rio export fails for duration variable with label attribute, haven-export works #433

Open
hsbizahlenzauber opened this issue Jun 17, 2024 · 4 comments


@hsbizahlenzauber

MWE


library("lubridate")
library("rio")
library("haven")

#creating a variable of class period

duration <- seconds_to_period(duration(hours=26000))

#creating data frame

data <- data.frame(duration)

#setting variable label as attribute
#without this attribute, everything works as expected
attr(data$duration, "label") <- "Duration"

#export with haven
#no error message
haven::write_sav(data, path="testHAVEN.sav")

#export with rio
#Error in new_labelled():
#! x must be a numeric or a character vector.
rio::export(data, file="testRIO.sav")

@chainsawriot
Collaborator

@hsbizahlenzauber Thank you for reporting this. I can reproduce this.

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(rio)

duration <- seconds_to_period(duration(hours = 26000))

#creating data frame

duration_data <- data.frame(duration)
attr(duration_data$duration, "label") <- "Duration"
haven::write_sav(duration_data, path = tempfile(fileext = ".sav"))

rio::export(duration_data, file = tempfile(fileext = ".sav"))
#> Error in `new_labelled()`:
#> ! `x` must be a numeric or a character vector.

Created on 2024-06-20 with reprex v2.1.0

And this is where it fails.

restore_labelled <- function(x) {
    # restore labelled variable classes
    x[] <- lapply(x, function(v) {
        if (is.factor(v)) {
            haven::labelled(
                x = as.numeric(v),
                labels = stats::setNames(seq_along(levels(v)), levels(v)),
                label = attr(v, "label", exact = TRUE)
            )
        } else if (!is.null(attr(v, "labels", exact = TRUE)) || !is.null(attr(v, "label", exact = TRUE))) {
            haven::labelled(
                x = v,
                labels = attr(v, "labels", exact = TRUE),
                label = attr(v, "label", exact = TRUE)
            )
        } else {
            v
        }
    })
    x
}

TBH, I have to study this code before I can come up with a solution, because I am not very familiar with the variable-labelling code. It might take some time.
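
For illustration only: in the function quoted above, the second branch hands any column carrying a "label" or "labels" attribute to haven::labelled() unchanged, and the reported error says that x must be a numeric or a character vector. One conceivable guard is sketched below. The name restore_labelled_guarded and the helper can_wrap are hypothetical, this is not the fix adopted in rio, and leaving unsupported columns untouched is an assumption about the desired behaviour. It only avoids the error; it does not make haven able to store a Period.

```r
# Hypothetical guarded variant of the function quoted above; not rio's actual fix.
restore_labelled_guarded <- function(x) {
    # Assumption: only plain (non-S4) numeric or character vectors are wrapped.
    can_wrap <- function(v) (is.numeric(v) || is.character(v)) && !isS4(v)
    x[] <- lapply(x, function(v) {
        label <- attr(v, "label", exact = TRUE)
        labels <- attr(v, "labels", exact = TRUE)
        if (is.factor(v)) {
            # Same handling of factors as in the quoted function.
            haven::labelled(
                x = as.numeric(v),
                labels = stats::setNames(seq_along(levels(v)), levels(v)),
                label = label
            )
        } else if ((!is.null(labels) || !is.null(label)) && can_wrap(v)) {
            haven::labelled(x = v, labels = labels, label = label)
        } else {
            # Unsupported classes (for example an S4 Period) are returned as-is.
            v
        }
    })
    x
}

# Small demonstration with base R classes only.
demo <- data.frame(a = c(1, 2), d = as.Date(c("2024-06-17", "2024-06-20")))
attr(demo$a, "label") <- "A numeric column"
attr(demo$d, "label") <- "A date column"
guarded <- restore_labelled_guarded(demo)
class(guarded$a)
class(guarded$d)
```

Whether such columns should be skipped silently, skipped with a warning, or rejected with a clearer error is a design decision for the maintainers.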

@chainsawriot
Collaborator

chainsawriot commented Jun 20, 2024

755cdec #268

@chainsawriot
Collaborator

@hsbizahlenzauber You know that haven (the package under the hood) does not actually support the Period class of lubridate, right?

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(rio)

duration <- seconds_to_period(duration(hours = 26000))

#creating data frame

duration_data <- data.frame(duration)
duration_data
#>         duration
#> 1 1083d 8H 0M 0S
attr(duration_data$duration, "label") <- "Duration"
haven::write_sav(duration_data, path = x <- tempfile(fileext = ".sav"))

haven::read_sav(x)
#> # A tibble: 1 × 1
#>   duration
#>      <dbl>
#> 1        0

Created on 2024-06-20 with reprex v2.1.0
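
Since the round trip above returns 0 rather than the original period, a user-side workaround is to convert the Period column to a plain number before exporting. The sketch below is an assumption about what the reporter might want (total seconds as a numeric column with the variable label kept); the helper name periods_to_seconds is hypothetical and is not part of rio, haven, or lubridate.

```r
library(lubridate)

duration <- seconds_to_period(duration(hours = 26000))
duration_data <- data.frame(duration)
attr(duration_data$duration, "label") <- "Duration"

# Hypothetical helper: replace every Period column by its length in seconds,
# re-attaching the "label" attribute if one was set.
periods_to_seconds <- function(df) {
    df[] <- lapply(df, function(v) {
        if (lubridate::is.period(v)) {
            label <- attr(v, "label", exact = TRUE)
            v <- lubridate::period_to_seconds(v)
            attr(v, "label") <- label
        }
        v
    })
    df
}

converted <- periods_to_seconds(duration_data)
str(converted)
```

The converted data frame holds an ordinary numeric column, so it should be acceptable to both haven::write_sav() and rio::export(). The unit (seconds here) is not stored in the file and is worth mentioning in the variable label.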

@hsbizahlenzauber
Author

@chainsawriot Many thanks for taking care of this and for the tip. I have to admit that I'm not really familiar with the details of different data types. I report when I notice something and I'm very glad that there are people who deal with it.
