Combine variables — prep_combine_vars

A wrapper around tidyr::unite() that pastes several columns into one. It also checks that the united result is identical to what dplyr::coalesce() returns for the same columns; if it is not, the input data.frame is returned unchanged. This is useful for uniting sparsely populated columns, for example when processing an ARD that was created with cards::ard_stack() and then shuffled with shuffle_card().
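The unite-then-compare idea described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the helper name combine_if_sparse() is hypothetical, and the use of na.rm = TRUE in tidyr::unite() is an assumption about how missing values are dropped before pasting.

```r
library(tidyr)
library(dplyr)

# Hypothetical helper (not part of the package): unite `vars` into
# `variable_level`, but keep the result only if it matches coalesce().
combine_if_sparse <- function(df, vars, remove = TRUE) {
  # A single variable leaves nothing to unite
  if (length(vars) < 2) {
    return(df)
  }

  # Paste the columns together, dropping NAs before pasting (assumed behaviour)
  united <- tidyr::unite(
    df, "variable_level", all_of(vars),
    na.rm = TRUE, remove = remove
  )

  # First non-missing value per row across the same columns
  coalesced <- do.call(dplyr::coalesce, unname(as.list(df[vars])))

  # Rows with more than one populated column make the two results differ,
  # in which case the input is handed back untouched
  if (identical(united$variable_level, coalesced)) united else df
}

sparse <- data.frame(id = 1:2, x = c("a", NA), y = c(NA, "b"))
combine_if_sparse(sparse, vars = c("x", "y"))
```

The comparison against coalesce() acts as a safety check: when each row has exactly one populated column, pasting and coalescing agree, so no information is lost by collapsing the columns.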

If the data is the result of a hierarchical ARD stack (created with cards::ard_stack_hierarchical() or cards::ard_stack_hierarchical_count()), the input is returned unchanged. This is determined from the context column, which must be present; if the input data has no context column, the input is also returned unchanged.
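The conditions under which the input is returned unchanged can be summarised in a small guard function. This is only a sketch under stated assumptions: should_skip_combine() is a hypothetical name, and the hierarchical_labels argument is a placeholder for whatever values the context column carries for hierarchical stacks, which this page does not specify.

```r
# Hypothetical guard (not part of the package): TRUE when the input
# should be returned as is. `hierarchical_labels` is an assumed placeholder.
should_skip_combine <- function(df, vars,
                                hierarchical_labels = "hierarchical") {
  # A single variable: nothing to unite
  if (length(vars) < 2) {
    return(TRUE)
  }

  # No context column: the kind of stack cannot be assessed
  if (!"context" %in% names(df)) {
    return(TRUE)
  }

  # Any row flagged as hierarchical: leave the data alone
  any(df$context %in% hierarchical_labels)
}

flat <- data.frame(context = "categorical", b = "x", c = NA)
should_skip_combine(flat, vars = c("b", "c"))
```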

Usage

prep_combine_vars(df, vars, remove = TRUE)

Arguments

df: (data.frame)
vars: (character) a vector of variables to unite. If a single variable is supplied, the input is returned unchanged.
remove: (logical) If TRUE, remove the input columns from the output data frame.

Value

A data.frame with an additional column called variable_level, or the input unchanged.

Examples

df <- data.frame(
  a = 1:6,
  context = rep("categorical", 6),
  b = c("a", rep(NA, 5)),
  c = c(NA, "b", rep(NA, 4)),
  d = c(NA, NA, "c", rep(NA, 3)),
  e = c(NA, NA, NA, "d", rep(NA, 2)),
  f = c(NA, NA, NA, NA, "e", NA),
  g = c(rep(NA, 5), "f")
)

prep_combine_vars(
  df,
  vars = c("b", "c", "d", "e", "f", "g")
)
#>   a     context variable_level
#> 1 1 categorical              a
#> 2 2 categorical              b
#> 3 3 categorical              c
#> 4 4 categorical              d
#> 5 5 categorical              e
#> 6 6 categorical              f