13  tuners

By default, models fit on confidential data do not use additional hyperparameter tuning. This section describes how users can specify hyperparameter tuning schemes for their models using cross-validation.

13.1 tuner specifications

Each element passed to tuner is a named list with three required elements:

  • v: the number of cross-validation folds.
  • grid: either a data.frame of tuning combinations or a positive integer giving the number of candidate combinations to create automatically. This value is passed to tune::tune_grid().
  • metric: a library(yardstick) metric function or metric_set used to select an optimal hyperparameter.

Here’s an example:

# example tuner named list
example_tuner <- list(
  v = 10, # 10-fold cross validation
  grid = 5, # 5 candidate hyperparameter values generated automatically by tune::tune_grid()
  metric = yardstick::rmse # root-mean-square error
)
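
The grid element can also be an explicit data.frame instead of an integer. The sketch below is illustrative only: the name example_tuner_df and the candidate values are assumptions, and the column name cost_complexity assumes the model marks a parameter of that name with tune(), since tune::tune_grid() matches grid columns to tunable parameter names.

```r
# example tuner with an explicit grid (values chosen for illustration only)
example_tuner_df <- list(
  v = 5, # 5-fold cross validation
  # one row per candidate; column names must match the tuned parameter(s)
  grid = data.frame(cost_complexity = c(0.0001, 0.001, 0.01)),
  # a metric_set is accepted in place of a single metric function
  metric = yardstick::metric_set(yardstick::rmse)
)
```

An explicit grid makes the candidate values reproducible and easy to review, whereas an integer leaves their selection to tune::tune_grid().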

When tuner objects are specified, the following steps occur:

  1. For each cross-validation fold (v) and element of the hyperparameter grid (grid), tidysynthesis fits a model and calculates the metric (metric).
  2. The metric results are averaged over cross-validation folds and the optimal model is selected.
  3. The optimal model is then used to generate samples.
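
The three steps above can be approximated with library(tune) directly. This is a rough sketch under stated assumptions, not the tidysynthesis internals: mtcars stands in for the confidential data, the formula mpg ~ . is hypothetical, and all object names are chosen for illustration.

```r
library(tidymodels)

# a model with one tunable hyperparameter (assumed for illustration)
tree_spec <- decision_tree(cost_complexity = tune()) %>%
  set_engine("rpart") %>%
  set_mode("regression")

# step 1: fit every grid candidate on every fold and compute the metric
set.seed(1)
folds <- vfold_cv(mtcars, v = 5) # corresponds to v
tuned <- tune_grid(
  tree_spec,
  preprocessor = mpg ~ .,
  resamples = folds,
  grid = 5,                   # corresponds to grid
  metrics = metric_set(rmse)  # corresponds to metric
)

# step 2: average the metric over folds and pick the best candidate
collect_metrics(tuned)
best <- select_best(tuned, metric = "rmse")

# step 3: refit the model with the selected hyperparameter value
final_fit <- finalize_model(tree_spec, best) %>%
  fit(mpg ~ ., data = mtcars)
```

In tidysynthesis the final fitted model would then be used to generate synthetic samples; here the sketch stops at the refit.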

Note

As of version 0.1.0, hyperparameter optimization in tidysynthesis exclusively occurs at the confidential model fitting stage. Optimal hyperparameters will depend on confidential data, which may increase disclosure risks. When fitting models that may be sensitive to specific hyperparameter settings, consider determining hyperparameters based on public datasets.

13.2 Using tuners with model specifications

tidymodels provides the infrastructure for hyperparameter tuning through the library(tune) package. To use it, first specify a model in which each hyperparameter to be tuned is replaced with a tune() placeholder.

library(tidymodels)
library(tidysynthesis)

# specify a model with a tunable parameter
rpart_mod <- parsnip::decision_tree(cost_complexity = tune::tune()) %>%
  parsnip::set_engine(engine = "rpart") %>%
  parsnip::set_mode(mode = "regression")