4 roadmap
As discussed in the minimal example, roadmap objects describe the input and output data, its properties, and the macroscopic strategies for synthesizing your data. This section outlines the functionality available in roadmap.
4.1 roadmap Arguments
All syntheses start with a roadmap object, which is a container with information about the order of operations for a synthesis. roadmap() creates a roadmap S3 object and accepts many arguments for modifying its behavior.
As a reminder, roadmap's constructor requires two inputs:
- conf_data: A data frame with the confidential data used to generate the synthetic data. The resulting synthetic data will have the same number of columns as conf_data.
- start_data: A data frame with a strict subset of variables from conf_data, which is used to start the synthesis process. The resulting synthetic data will have the same number of rows as start_data.
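The relationship between the two required inputs can be checked with a few lines of base R. The sketch below is only an illustration of the rules stated above: check_roadmap_inputs() is a hypothetical helper, not a tidysynthesis function, and the example data frame is made up.

```r
# Hypothetical helper (not part of tidysynthesis): checks the two required
# inputs against the rules described above, using base R only.
check_roadmap_inputs <- function(conf_data, start_data) {
  stopifnot(is.data.frame(conf_data), is.data.frame(start_data))

  conf_vars <- names(conf_data)
  start_vars <- names(start_data)

  # Every start variable must come from conf_data.
  if (!all(start_vars %in% conf_vars)) {
    stop("start_data contains variables that are not in conf_data")
  }

  # Strict subset: at least one variable must be left to synthesize.
  if (length(start_vars) >= length(conf_vars)) {
    stop("start_data must be a strict subset of the conf_data variables")
  }

  # Expected shape of the synthetic data: rows follow start_data,
  # columns follow conf_data.
  c(rows = nrow(start_data), cols = ncol(conf_data))
}

# Made-up confidential data for illustration.
conf_data <- data.frame(
  age = c(34, 51, 27, 45),
  sex = factor(c("f", "m", "f", "m")),
  income = c(52000, 61000, 38000, 70000)
)

# Start from a single variable and three of the four records.
start_data <- conf_data[1:3, "sex", drop = FALSE]

expected_dims <- check_roadmap_inputs(conf_data, start_data)
```

Here the synthetic data would be expected to have three rows (from start_data) and three columns (from conf_data).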
roadmap also offers the following optional arguments:
- start_method: An object that is executed prior to running a synthesis. start_method objects modify the start_data, typically randomly, to provide greater disclosure risk protections. By default, the start_data is used unmodified in the final synthetic data output.
- schema: An object that handles data type information about each column in the confidential data. schema objects can also modify data types, missing data definitions, and factor level definitions. By default, the schema is inferred from conf_data.
- visit_sequence: An object that specifies the order of synthesis for a sequential synthesis. visit_sequence objects can be specified manually or derived from the data. By default, the visit_sequence uses the order in which variables appear in conf_data.
- replicates: An object that controls the strategy and frequency for multiple syntheses. tidysynthesis lets you generate multiple replicates of the start data, conditional syntheses, and/or the entire end-to-end synthesis process. By default, the replicates object only creates one synthetic dataset with the same number of rows as start_data and the same number of columns as conf_data.
- constraints: An object that defines constraints for the synthetic data and strategies for enforcing constraints during the synthesis process. These constraints can limit numeric variables to specific ranges or define allowed or forbidden levels for factor variables. By default, no constraints are implemented.
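As a rough illustration of the default visit_sequence behavior, the base R sketch below derives a synthesis order from the column order of a made-up data frame. It assumes that variables already supplied in start_data are not synthesized again; that assumption, and the variable names, are ours and are not taken from the tidysynthesis documentation.

```r
# Made-up confidential data; column order is age, sex, income.
conf_data <- data.frame(
  age = c(34, 51, 27, 45),
  sex = factor(c("f", "m", "f", "m")),
  income = c(52000, 61000, 38000, 70000)
)

# Variables assumed to be provided by start_data.
start_vars <- "sex"

# Assumption: the default order is the conf_data column order, skipping
# variables that start_data already provides. setdiff() keeps the order
# of its first argument.
default_order <- setdiff(names(conf_data), start_vars)
```

With these inputs, default_order contains "age" followed by "income".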
4.2 roadmap Tidy API
The required arguments for roadmap, conf_data and start_data, cannot be modified without creating a new roadmap instance. However, the remaining arguments can be updated using the provided Tidy API.
We recommend reading the individual “roadmap Components” documentation pages to learn best practices for using the Tidy API.
The Tidy API provides the following functions:
- add_*(roadmap, ...) functions can be used to add components to a roadmap:
  - add_start_method(roadmap, start_method)
  - add_schema(roadmap, schema)
  - add_visit_sequence(roadmap, visit_sequence)
  - add_replicates(roadmap, replicates)
  - add_constraints(roadmap, constraints)
- update_*(roadmap, ...) functions can be used to modify component arguments in a roadmap:
  - update_start_method(roadmap, ...)
  - update_schema(roadmap, ...)
  - update_replicates(roadmap, ...)
  - update_constraints(roadmap, ...)
- reset_*(roadmap) functions can be used to reset component arguments to their default values in a roadmap:
  - reset_start_method(roadmap)
  - reset_schema(roadmap)
  - reset_visit_sequence(roadmap)
  - reset_replicates(roadmap)
  - reset_constraints(roadmap)
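To make the add/update/reset pattern concrete, the following is a toy base R sketch of how such functions can chain when each one takes an object and returns a modified copy. It is not the tidysynthesis implementation: every toy_* name is hypothetical, the constraint fields (min_age, max_age) are invented, and the assumption that the real functions return the updated roadmap is ours. The native pipe requires R 4.1 or later.

```r
# Toy stand-in for a roadmap object. NOT the tidysynthesis implementation.
toy_roadmap <- function(conf_data, start_data) {
  structure(
    list(conf_data = conf_data, start_data = start_data, constraints = NULL),
    class = "toy_roadmap"
  )
}

# add: attach a whole component.
toy_add_constraints <- function(roadmap, constraints) {
  stopifnot(inherits(roadmap, "toy_roadmap"))
  roadmap$constraints <- constraints
  roadmap
}

# update: change individual arguments of an existing component.
toy_update_constraints <- function(roadmap, ...) {
  stopifnot(inherits(roadmap, "toy_roadmap"))
  roadmap$constraints <- utils::modifyList(
    as.list(roadmap$constraints),
    list(...)
  )
  roadmap
}

# reset: return the component to its default (here, no constraints).
toy_reset_constraints <- function(roadmap) {
  stopifnot(inherits(roadmap, "toy_roadmap"))
  roadmap["constraints"] <- list(NULL)
  roadmap
}

conf_data <- data.frame(age = c(34, 51, 27), income = c(52000, 61000, 38000))
start_data <- conf_data["age"]

# Each call returns the modified object, so the calls chain with the pipe.
rmap <- toy_roadmap(conf_data, start_data) |>
  toy_add_constraints(list(min_age = 0, max_age = 120)) |>
  toy_update_constraints(max_age = 100)

rmap_reset <- toy_reset_constraints(rmap)
```

After the chained calls, rmap holds the updated max_age alongside the original min_age, while rmap_reset has no constraints. Consult the individual component pages for the arguments that the real add_*, update_*, and reset_* functions accept.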