4 roadmap
As discussed in the minimal example, roadmap objects describe the input and output data, its properties, and the macroscopic strategies for synthesizing your data. This section outlines the functionality available in roadmap.
4.1 roadmap Arguments
All syntheses start with a roadmap object, which is a container with information about the order of operations for a synthesis. roadmap() creates a roadmap S3 object and accepts several arguments for modifying its behavior.
As a reminder, roadmap’s constructor requires two inputs:
conf_data: A data frame with the confidential data used to generate the synthetic data. The resulting synthetic data will have the same number of columns as conf_data.
start_data: A data frame with a strict subset of variables from conf_data, which is used to start the synthesis process. The resulting synthetic data will have the same number of rows as start_data.
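As a minimal sketch of the two required inputs, the snippet below builds a small made-up data frame, takes a strict subset of its columns as start_data, and passes both to roadmap(). The toy data and its column names (region, age, income) are illustrative assumptions, not taken from the package documentation. The call to roadmap() is guarded so that the data preparation still runs when tidysynthesis is not installed.

```r
# Toy confidential data; the columns are made up for illustration.
set.seed(20240101)
n <- 20
conf_data <- data.frame(
  region = factor(sample(c("north", "south"), n, replace = TRUE)),
  age = round(runif(n, min = 18, max = 80)),
  income = round(rlnorm(n, meanlog = 10, sdlog = 0.5))
)

# start_data holds a strict subset of the variables in conf_data.
# Single-bracket indexing keeps the result a data frame.
start_data <- conf_data["region"]

# Check the relationship described above before building the roadmap.
stopifnot(
  all(names(start_data) %in% names(conf_data)),
  ncol(start_data) < ncol(conf_data)
)

# Build the roadmap only if tidysynthesis is available.
if (requireNamespace("tidysynthesis", quietly = TRUE)) {
  rmap <- tidysynthesis::roadmap(
    conf_data = conf_data,
    start_data = start_data
  )
  print(class(rmap))
}
```

Here start_data reuses the region column as-is, so a synthesis based on this roadmap would produce 20 rows and 3 columns.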
roadmap also offers the following optional arguments:
start_method: An object that is executed prior to running a synthesis. start_method objects modify the start_data, typically randomly, to provide greater disclosure risk protections. By default, the start_data is used unmodified in the final synthetic data output.
schema: An object that handles data type information about each column in the confidential data. schema objects can also modify data types, missing data definitions, and factor level definitions. By default, the schema is inferred from conf_data.
visit_sequence: An object that specifies the order of synthesis for a sequential synthesis. visit_sequence objects can be specified manually or derived from the data. By default, the visit_sequence uses the order that variables appear in conf_data.
replicates: An object that controls the strategy and frequency for multiple syntheses. tidysynthesis lets you generate multiple replicates of the start data, conditional syntheses, and/or the entire end-to-end synthesis process. By default, the replicates object only creates one synthetic dataset with the same number of rows as start_data and the same number of columns as conf_data.
constraints: An object that defines constraints for the synthetic data and strategies for enforcing constraints during the synthesis process. These constraints can limit numeric variables to specific ranges or define allowed or forbidden levels for factor variables. By default, no constraints are implemented.
4.2 roadmap Tidy API
The required arguments for roadmap, conf_data and start_data, cannot be modified without creating a new roadmap instance. However, the remaining arguments can be updated using the provided Tidy API.
We recommend reading the individual “roadmap Components” documentation pages to learn best practices for using the Tidy API.
The Tidy API provides the following functions:
add_*(roadmap, ...) functions can be used to add components to a roadmap:
  add_start_method(roadmap, start_method)
  add_schema(roadmap, schema)
  add_visit_sequence(roadmap, visit_sequence)
  add_replicates(roadmap, replicates)
  add_constraints(roadmap, constraints)
update_*(roadmap, ...) functions can be used to modify component arguments in a roadmap:
  update_start_method(roadmap, ...)
  update_schema(roadmap, ...)
  update_replicates(roadmap, ...)
  update_constraints(roadmap, ...)
reset_*(roadmap) functions can be used to reset component arguments to their default values in a roadmap:
  reset_start_method(roadmap)
  reset_schema(roadmap)
  reset_visit_sequence(roadmap)
  reset_replicates(roadmap)
  reset_constraints(roadmap)
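The sketch below shows how these functions might be chained with the pipe, starting from a roadmap built from the required inputs alone. The toy data is made up for illustration. The argument name end_to_end_replicates passed to update_replicates() is an assumption about the replicates component and should be checked against that component's documentation page. The tidysynthesis calls are guarded so the snippet still runs when the package is not installed.

```r
# Toy inputs; the columns are made up for illustration.
set.seed(20240101)
n <- 20
conf_data <- data.frame(
  region = factor(sample(c("north", "south"), n, replace = TRUE)),
  income = round(rlnorm(n, meanlog = 10, sdlog = 0.5))
)
start_data <- conf_data["region"]

if (requireNamespace("tidysynthesis", quietly = TRUE)) {
  library(tidysynthesis)

  # A roadmap with every optional component left at its default.
  rmap <- roadmap(conf_data = conf_data, start_data = start_data)

  # update_*() modifies arguments of a component in the roadmap.
  # ASSUMPTION: `end_to_end_replicates` is an argument of the replicates
  # component; verify the name in the replicates documentation.
  rmap_multi <- rmap |>
    update_replicates(end_to_end_replicates = 3)

  # reset_*() returns the component to its default values.
  rmap_default <- rmap_multi |>
    reset_replicates()
}
```

Because each call takes a roadmap as its first argument, the original rmap object can be kept alongside the modified copies, which makes it easy to compare alternative synthesis setups built from the same conf_data and start_data.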