14 noise
tidysynthesis sampler elements dictate how new synthetic records are drawn from models. noise elements dictate how additional randomization may optionally be injected beyond the randomness that already comes from sampling new records.
14.1 noise components
The function noise() creates a noise S3 object with the following properties:
- add_noise: required Boolean, either TRUE or FALSE
- mode: required string, either "regression" or "classification"
- noise_func: a function that applies the noise transformation.
Each noise_func has the following arguments:
- model: a model fit object from parsnip
- new_data: a data.frame containing the working synthetic data.
- conf_model_data: a data.frame containing the confidential data used for modeling purposes.
- outcome_var: a string variable name for the new variable.
- col_schema: a col_schema named list element matching the schema.
- pred: a vector of predicted values from the sampling application.
Additional arguments provided to noise() will be passed to noise_func.
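The argument list above can be illustrated with a minimal sketch of a user-written noise function. The name my_noise_func and the extra argument sd are hypothetical and not part of tidysynthesis. The sketch also assumes that a noise function returns the perturbed prediction vector, which the text above does not state explicitly.

```r
# Hypothetical custom noise function using the six arguments listed above.
# `sd` is an illustrative extra argument; it is an assumption, not a
# tidysynthesis parameter.
my_noise_func <- function(model, new_data, conf_model_data,
                          outcome_var, col_schema, pred, sd = 1) {
  # Only `pred` is used in this sketch; the other arguments are accepted
  # so that the signature matches the documented argument list.
  stopifnot(is.numeric(pred), sd >= 0)

  # Add independent mean-zero Gaussian noise to each predicted value.
  # Assumption: the function returns the noisy predictions as a vector.
  pred + stats::rnorm(length(pred), mean = 0, sd = sd)
}

# Stand-alone call with placeholder inputs, outside of any synthesis run.
set.seed(20240101)
noisy <- my_noise_func(
  model = NULL,
  new_data = data.frame(),
  conf_model_data = data.frame(),
  outcome_var = "y",
  col_schema = list(),
  pred = c(10, 20, 30),
  sd = 0.5
)
```

A function of this shape could presumably be supplied as noise_func, with sd given as an additional argument to noise() so that it is forwarded as described above.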
14.2 noise_func examples provided by tidysynthesis
tidysynthesis provides the following example noise functions.
- add_noise_gaussian: add independent Gaussian noise with mean zero and constant variance to numeric variables.
- add_noise_laplace: add independent Laplace noise with mean zero and constant variance to numeric variables.
- add_noise_cat_unif: add independent uniform noise from a categorical distribution to categorical variables.
- add_noise_kde: add dependent Gaussian noise with mean zero and variance estimated from a kernel density estimate bandwidth for numeric variables.
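To make "independent Laplace noise with mean zero and constant variance" concrete, the following base R sketch draws such noise by inverse-CDF sampling. It is not the tidysynthesis implementation of add_noise_laplace. The helper name sketch_laplace_noise and its variance argument are assumptions made for illustration.

```r
# Illustrative helper, NOT the tidysynthesis implementation.
# Adds independent mean-zero Laplace noise with the requested variance.
sketch_laplace_noise <- function(pred, variance = 1) {
  stopifnot(is.numeric(pred), variance > 0)

  # A Laplace distribution with scale b has variance 2 * b^2,
  # so the scale that yields the requested variance is sqrt(variance / 2).
  b <- sqrt(variance / 2)

  # Inverse-CDF sampling: if U ~ Uniform(-1/2, 1/2), then
  # -b * sign(U) * log(1 - 2 * |U|) follows Laplace(0, b).
  u <- stats::runif(length(pred), min = -0.5, max = 0.5)
  pred - b * sign(u) * log(1 - 2 * abs(u))
}

# Rough check on many draws: the sample mean should be close to zero and
# the sample variance close to the requested value.
set.seed(1)
draws <- sketch_laplace_noise(rep(0, 20000), variance = 2)
draw_mean <- mean(draws)
draw_var <- stats::var(draws)
```

Compared with Gaussian noise of the same variance, Laplace noise puts more mass both near zero and in the tails.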