Synthetic Data with tidysynthesis
Welcome
This website documents tidysynthesis, an R package that generates synthetic data using a sequence of predictive models. The goal of tidysynthesis is to provide a process, built on the tidymodels framework, for generating synthetic data that allows administrative data to be released safely for some types of research.
This documentation is a work in progress.
License
tidysynthesis and this documentation are free to use and licensed under the GNU AGPLv3 license.
Acknowledgements
tidysynthesis was funded by the Alfred P. Sloan Foundation [G-2022-17149] and National Science Foundation National Center for Science and Engineering Statistics [49100422C0008].
Early versions of the package and its codebase were developed in collaboration with the following partners, whose input was instrumental in shaping its design:
- Bureau of Economic Analysis
- Bureau of Justice Statistics
- Department of Human Services in Allegheny County, Pennsylvania
- National Science Foundation National Center for Science and Engineering Statistics
- Nebraska Statewide Workforce & Educational Reporting System
- Statistics of Income Division of the IRS