Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function contr.deviation() #444

Merged
merged 12 commits into from
Aug 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -208,6 +208,7 @@ export(coef_var)
export(coerce_to_numeric)
export(colnames_to_row)
export(column_as_rownames)
export(contr.deviation)
export(convert_na_to)
export(convert_to_na)
export(data_addprefix)
Expand Down
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

NEW FUNCTIONS

* `contr.deviation()` for sum-deviation contrast coding of factors.

* `rowmean_n()`, to compute row means if row contains at least `n` non-missing
values.

Expand Down
98 changes: 98 additions & 0 deletions R/contrs.R
mattansb marked this conversation as resolved.
Show resolved Hide resolved
mattansb marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
#' Deviation Contrast Matrix
#'
#' Build a deviation contrast matrix, a type of _effects contrast_ matrix.
#'
#' @inheritParams stats::contr.sum
#'
#' @details
#' In effects coding, unlike treatment/dummy coding ([stats::contr.treatment()]), each
#' contrast sums to 0. In regressions models, this results in an intercept that
#' represents the (unweighted) average of the group means. In ANOVA settings,
#' this also guarantees that lower order effects represent _main_ effects (and
#' not _simple_ or _conditional_ effects, as is the case when using R's default
#' [stats::contr.treatment()]).
#' \cr\cr
#' Deviation coding (`contr.deviation`) is a type of effects coding.
#' With deviation coding, the coefficients for factor variables are interpreted
#' as the difference of each factor level from the base level
#' (this is the same interpretation as with treatment/dummy coding).
#' For example, for a factor `group` with levels "A", "B", and "C", with `contr.devation`,
#' the intercept represents the overall mean (average of the group means for the 3 groups),
#' and the coefficients `groupB` and `groupC` represent the differences between
#' the A group mean and the B and C group means, respectively.
#' \cr\cr
#' Sum coding ([stats::contr.sum()]) is another type of effects coding.
#' With sum coding, the coefficients for factor variables are interpreted
#' as the difference of each factor level from **the grand (across-groups) mean**.
#' For example, for a factor `group` with levels "A", "B", and "C", with `contr.sum`,
#' the intercept represents the overall mean (average of the group means for the 3 groups),
#' and the coefficients `group1` and `group2` represent the differences the
#' **A** and **B** group means from the overall mean, respectively.
#'
#' @seealso [stats::contr.sum()]
#'
#' @examples
#' \dontrun{
mattansb marked this conversation as resolved.
Show resolved Hide resolved
#' data("mtcars")
#'
#' mtcars <- data_modify(mtcars, cyl = factor(cyl))
#'
#' c.treatment <- cbind(Intercept = 1, contrasts(mtcars$cyl))
#' solve(c.treatment)
#' #> 4 6 8
#' #> Intercept 1 0 0 # mean of the 1st level
#' #> 6 -1 1 0 # 2nd level - 1st level
#' #> 8 -1 0 1 # 3rd level - 1st level
#'
#' contrasts(mtcars$cyl) <- contr.sum
#' c.sum <- cbind(Intercept = 1, contrasts(mtcars$cyl))
#' solve(c.sum)
#' #> 4 6 8
#' #> Intercept 0.333 0.333 0.333 # overall mean
#' #> 0.667 -0.333 -0.333 # deviation of 1st from overall mean
#' #> -0.333 0.667 -0.333 # deviation of 2nd from overall mean
#'
#'
#' contrasts(mtcars$cyl) <- contr.deviation
#' c.deviation <- cbind(Intercept = 1, contrasts(mtcars$cyl))
#' solve(c.deviation)
#' #> 4 6 8
#' #> Intercept 0.333 0.333 0.333 # overall mean
#' #> 6 -1.000 1.000 0.000 # 2nd level - 1st level
#' #> 8 -1.000 0.000 1.000 # 3rd level - 1st level
#'
#' ## With Interactions -----------------------------------------
#' mtcars <- data_modify(mtcars, am = factor(am))
#' mtcars <- data_arrange(mtcars, select = c("cyl", "am"))
#'
#' mm <- unique(model.matrix(~ cyl * am, data = mtcars))
#' rownames(mm) <- c(
#' "cyl4.am0", "cyl4.am1", "cyl6.am0",
#' "cyl6.am1", "cyl8.am0", "cyl8.am1"
#' )
#'
#' solve(mm)
#' #> cyl4.am0 cyl4.am1 cyl6.am0 cyl6.am1 cyl8.am0 cyl8.am1
#' #> (Intercept) 0.167 0.167 0.167 0.167 0.167 0.167 # overall mean
#' #> cyl6 -0.500 -0.500 0.500 0.500 0.000 0.000 # cyl MAIN eff: 2nd - 1st
#' #> cyl8 -0.500 -0.500 0.000 0.000 0.500 0.500 # cyl MAIN eff: 2nd - 1st
#' #> am1 -0.333 0.333 -0.333 0.333 -0.333 0.333 # am MAIN eff
#' #> cyl6:am1 1.000 -1.000 -1.000 1.000 0.000 0.000
#' #> cyl8:am1 1.000 -1.000 0.000 0.000 -1.000 1.000
#' }
#'
#' @export
contr.deviation <- function(n, base = 1,

Check warning on line 85 in R/contrs.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=R/contrs.R,line=85,col=1,[object_name_linter] Variable and function name style should match snake_case or symbols.

Check warning on line 85 in R/contrs.R

View workflow job for this annotation

GitHub Actions / lint / lint

file=R/contrs.R,line=85,col=1,[object_name_linter] Variable and function name style should match snake_case or symbols.
etiennebacher marked this conversation as resolved.
Show resolved Hide resolved
contrasts = TRUE,
sparse = FALSE) {
cont <- stats::contr.treatment(n,
base = base,
contrasts = contrasts,
sparse = sparse
)
if (contrasts) {
n <- nrow(cont)
cont <- cont - 1 / n
}
cont
}
6 changes: 5 additions & 1 deletion _pkgdown.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,8 @@ reference:
- data_duplicated
- data_unique

- title: Statistical Transformations
- title: Data and Variable Transformations
- subtitle: Statistical Transformations
desc: |
Functions for transforming variables
contents:
Expand Down Expand Up @@ -51,6 +52,9 @@ reference:
- normalize
- unstandardize
- makepredictcall.dw_transformer
- subtitle: "Others"
contents:
- contr.deviation
mattansb marked this conversation as resolved.
Show resolved Hide resolved

- title: Data Properties
desc: |
Expand Down
103 changes: 103 additions & 0 deletions man/contr.deviation.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

23 changes: 23 additions & 0 deletions tests/testthat/_snaps/contr.deviation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# contr.deviation | snapshot

Code
solve(c.deviation)
Output
4 6 8
Intercept 0.3333333 0.3333333 0.3333333
6 -1.0000000 1.0000000 0.0000000
8 -1.0000000 0.0000000 1.0000000

---

Code
solve(mm)
Output
cyl4.am0 cyl4.am1 cyl6.am0 cyl6.am1 cyl8.am0 cyl8.am1
(Intercept) 0.3333333 0.0000000 0.3333333 0.0000000 0.3333333 0.0000000
cyl6 -1.0000000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000
cyl8 -1.0000000 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000
am1 -0.3333333 0.3333333 -0.3333333 0.3333333 -0.3333333 0.3333333
cyl6:am1 1.0000000 -1.0000000 -1.0000000 1.0000000 0.0000000 0.0000000
cyl8:am1 1.0000000 -1.0000000 0.0000000 0.0000000 -1.0000000 1.0000000

28 changes: 28 additions & 0 deletions tests/testthat/test-contr.deviation.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
test_that("contr.deviation", {
c.treatment <- solve(cbind(Intercept = 1, contr.treatment(3)))
c.sum <- solve(cbind(Intercept = 1, contr.sum(3)))
c.deviation <- solve(cbind(Intercept = 1, contr.deviation(3)))

expect_equal(c.deviation[1, ], c.sum[1, ])

Check warning on line 6 in tests/testthat/test-contr.deviation.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-contr.deviation.R,line=6,col=3,[expect_identical_linter] Use expect_identical(x, y) by default; resort to expect_equal() only when needed, e.g. when setting ignore_attr= or tolerance=.
expect_equal(c.deviation[-1, ], c.treatment[-1, ])

Check warning on line 7 in tests/testthat/test-contr.deviation.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-contr.deviation.R,line=7,col=3,[expect_identical_linter] Use expect_identical(x, y) by default; resort to expect_equal() only when needed, e.g. when setting ignore_attr= or tolerance=.
})

test_that("contr.deviation | snapshot", {
skip_if_not_installed("base", "4.3")
# IF THIS TESTS FAILS, UPDATE THE EXAMPLE

data("mtcars")
mtcars <- data_modify(mtcars, cyl = factor(cyl))
mtcars <- data_modify(mtcars, am = factor(am))
mtcars <- data_arrange(mtcars, select = c("cyl", "am"))

contrasts(mtcars$cyl) <- contr.deviation
c.deviation <- cbind(Intercept = 1, contrasts(mtcars$cyl))
expect_snapshot(solve(c.deviation))

mm <- unique(model.matrix(~ cyl * am, data = mtcars))
rownames(mm) <- c("cyl4.am0", "cyl4.am1", "cyl6.am0",
"cyl6.am1", "cyl8.am0", "cyl8.am1")

expect_snapshot(solve(mm))
})
Loading