12  recipes and steps

tidysynthesis leverages recipes from the tidymodels framework to handle data preprocessing prior to modeling. The construction of

12.1 recipe Creation and Specifying Custom Steps

Creating a presynth object also builds a list of recipes automatically, one per synthesized variable. Here’s what that list looks like for our ACS example:

acs_roadmap <- tidysynthesis::roadmap(
  conf_data = tidysynthesis::acs_conf_nw, 
  start_data = tidysynthesis::acs_start_nw
)

rpart_mod <- parsnip::decision_tree() |>
  parsnip::set_engine(engine = "rpart") |>
  parsnip::set_mode(mode = "regression")

rpart_class <- parsnip::decision_tree() |>
  parsnip::set_engine(engine = "rpart") |>
  parsnip::set_mode(mode = "classification")

# create a basic synth_spec 
acs_synth_spec <- tidysynthesis::synth_spec(
  # use previously defined parsnip models
  default_regression_model = rpart_mod,
  default_classification_model = rpart_class,
  # use tidysynthesis-provided sampler functions
  default_regression_sampler = tidysynthesis::sample_rpart,
  default_classification_sampler = tidysynthesis::sample_rpart
)

acs_presynth <- tidysynthesis::presynth(
  roadmap = acs_roadmap,
  synth_spec = acs_synth_spec
)
Warning in construct_noise(roadmap = roadmap, default_regression_noise =
synth_spec[["default_regression_noise"]], : No noise specified, using default
noise() object.
Warning in construct_tuners(roadmap = roadmap, default_regression_tuner =
synth_spec[["default_regression_tuner"]], : No tuners specified, using default
tuner
Warning in construct_extractors(roadmap = roadmap, default_extractor =
synth_spec[["default_extractor"]], : No extractors specified, using default
extractor.
Some variable(s) have no non-default visit sequence method specified: TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE
print(acs_presynth$workflows$built_recipes)
$hcovany
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:   1
predictor: 4

$empstat
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:   1
predictor: 5

$classwkr
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:   1
predictor: 6

$age
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:   1
predictor: 7

$famsize
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:   1
predictor: 8

$transit_time
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:   1
predictor: 9

$inctot_NA
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:    1
predictor: 10

$inctot
── Recipe ──────────────────────────────────────────────────────────────────────
── Inputs 
Number of variables by role
outcome:    1
predictor: 11

As expected, each default recipe object uses the start_data variables plus every previously synthesized variable as predictors, so the predictor count grows by one at each step, from 4 for hcovany to 11 for inctot.
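The growing predictor counts in the listing above can be reproduced outside tidysynthesis with plain recipes. The sketch below is an illustration under assumptions, not the package's internal code: the toy data frame, the `start_vars` and `visit_sequence` vectors, and the `build_recipe()` helper are all hypothetical names introduced here.

```r
# Toy data: two "start" variables and three variables to synthesize in order.
# All names below are hypothetical and only illustrate the pattern.
toy <- data.frame(
  s1 = c(1, 2, 3, 4),
  s2 = c(0.5, 0.1, 0.9, 0.3),
  y1 = c(10, 20, 30, 40),
  y2 = c(5, 3, 8, 1),
  y3 = c(7, 7, 2, 9)
)
start_vars <- c("s1", "s2")
visit_sequence <- c("y1", "y2", "y3")

# Build the recipe for the i-th variable in the visit sequence: its predictors
# are the start variables plus everything synthesized before it.
build_recipe <- function(i) {
  outcome <- visit_sequence[i]
  predictors <- c(start_vars, visit_sequence[seq_len(i - 1)])
  recipes::recipe(
    stats::reformulate(predictors, response = outcome),
    data = toy[, c(predictors, outcome)]
  )
}

toy_recipes <- lapply(seq_along(visit_sequence), build_recipe)
names(toy_recipes) <- visit_sequence

# Count the variables with the "predictor" role in each recipe
n_predictors <- vapply(
  toy_recipes,
  function(rec) sum(summary(rec)$role == "predictor"),
  integer(1)
)
print(n_predictors)
```

Each recipe in `toy_recipes` gains one predictor relative to the previous one, mirroring the 4, 5, ..., 11 pattern in the ACS output.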

These default recipe objects can be modified using functions of the form recipes::step_* from the recipes package. To apply particular steps to particular conditional synthesis steps, users can define functions that take a recipe, add the desired steps, and return the modified recipe, then pass these new functions as arguments to synth_spec. Here’s an example:

#' 
#' Example recipe-modifying step for centering + scaling all numeric predictors
#' 
#' @param rec recipes::recipe
#' @returns recipes::recipe 
#' 
step1 <- function(rec) {
  
  # center, then scale, every predictor that is not a factor
  return(
    rec |>
      recipes::step_center(recipes::all_predictors(), 
                           -tidyselect::where(is.factor)) |>
      recipes::step_scale(recipes::all_predictors(),
                          -tidyselect::where(is.factor))
  ) 
}
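
To check what a step function like the one above does before handing it to synth_spec, it can be applied to a standalone recipe. The sketch below is a minimal illustration using the built-in iris data rather than the ACS example; `center_scale()` is a hypothetical variant of the function above that selects columns with recipes::all_numeric_predictors() instead of excluding factors explicitly.

```r
# Hypothetical variant of the step function above: center and scale
# every numeric predictor, leaving factor predictors and the outcome alone.
center_scale <- function(rec) {
  rec |>
    recipes::step_center(recipes::all_numeric_predictors()) |>
    recipes::step_scale(recipes::all_numeric_predictors())
}

# A plain recipe on built-in data: one numeric outcome, three numeric
# predictors, and one factor predictor (Species).
base_rec <- recipes::recipe(Sepal.Length ~ ., data = iris)

# Estimate the centering/scaling statistics, then apply them to the training data
prepped <- recipes::prep(center_scale(base_rec), training = iris)
baked <- recipes::bake(prepped, new_data = NULL)

num_cols <- c("Sepal.Width", "Petal.Length", "Petal.Width")

# Numeric predictors now have mean ~0 and standard deviation ~1
print(round(colMeans(baked[, num_cols]), 10))
print(round(sapply(baked[, num_cols], stats::sd), 10))
```

Because the function only receives and returns a recipe, the same check works for any combination of recipes::step_* calls.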

Some recipe transformations are invertible. The synth_spec argument invert_transformations determines whether or not transformations are inverted upon use.
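
To see why inversion matters, consider a log transform on the outcome: anything fit to the transformed recipe works on the log scale, so values have to be mapped back with exp() to land on the original scale. The sketch below does this by hand with recipes and the built-in mtcars data. It only illustrates the idea of an invertible transformation and is not a description of how tidysynthesis performs the inversion internally.

```r
# Recipe that log-transforms the outcome (natural log is the step_log default)
log_rec <- recipes::recipe(mpg ~ wt + hp, data = mtcars) |>
  recipes::step_log(recipes::all_outcomes())

prepped <- recipes::prep(log_rec, training = mtcars)

# Outcome as a model would see it: on the log scale
logged <- recipes::bake(prepped, new_data = NULL)$mpg

# Manual inverse of the transformation: exp() undoes the natural log
recovered <- exp(logged)

print(all.equal(recovered, mtcars$mpg))
```

Steps such as centering, scaling, and log transforms have a well-defined inverse like this one; a step that discards information does not.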