Design Philosophy
urbnindicators makes a number of opinionated design choices. “Opinionated” doesn’t mean that these decisions are the best ones for every user or use case; they are intended to speed up common workflows or make them more accurate.
Design choices
Support geographies from the tract level and up. Block groups are not supported because the margins of error for block group-level estimates are often so large as to make the estimates meaningless. Further, many estimates available from the tract level and up are not available for block groups.
Support five-year estimates only. One-year estimates carry larger margins of error, even for relatively populous geographies, and the Census Bureau publishes them only for areas with at least 65,000 residents, which rules out tracts, ZIP codes, and many places and counties.
Rename all variables. The default variable names returned by the API are not human-friendly. Not only is it challenging to determine what a given variable represents when you’re looking at a name like B01001_001E, but when you’re looking at a dozen or a hundred such variables, it’s very easy to accidentally misinterpret or mis-select the variable(s) you want. For these reasons, we apply more meaningful names to every returned variable while keeping variable names from the same table consistent, so that it’s easy to select and operate on sets of interrelated variables. The downside of this approach is that the default API variable names are used in other publications, and that you will find no documentation anywhere (apart from the codebook returned by this package!) of a variable named, for example, race_personofcolor_percent. Variables in the codebook have their original API names included in their definitions so that you can cross-reference these as needed.
Use a consistent variable naming convention. Variable names follow the pattern [concept]_[subconcept]_[characteristic]_[metric], for example, race_nonhispanic_white_alone_percent. The _percent suffix always denotes a derived percentage, _universe denotes the denominator used to calculate percentages for that table, and _M denotes a margin of error. This consistency makes it easy to use dplyr::matches() or dplyr::starts_with() to select related groups of variables.
Express percentages on a 0–1 scale. All derived percentages are expressed as proportions (e.g., 0.25 rather than 25). This avoids ambiguity and simplifies downstream calculations (e.g., multiplying a proportion by a population count). Use scales::percent() for display formatting. If you prefer a 0–100 scale, multiply these values (and their MOEs) by 100; the MOEs need no other adjustment.
Always propagate margins of error. When urbnindicators derives a new variable from two or more raw ACS estimates, it also calculates a margin of error for that derived variable using Census Bureau-recommended formulae. This means every _percent variable in the output has a corresponding _percent_M variable. These derived MOEs have known limitations (see vignette("quantified-survey-error")) but are far preferable to dropping error information entirely.
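As a rough illustration of how the naming convention supports variable selection, the sketch below builds a small, hypothetical data frame whose column names imitate the pattern described above (the columns and values are invented for illustration and are not real package output) and then selects related groups of variables with dplyr::starts_with() and dplyr::matches().

```r
library(dplyr)

# Hypothetical data: column names imitate the naming convention described
# above; the values are made up and are not real urbnindicators output.
df <- data.frame(
  GEOID = c("01001", "01003"),
  race_universe = c(1000, 2000),
  race_nonhispanic_white_alone = c(600, 1500),
  race_nonhispanic_white_alone_percent = c(0.60, 0.75),
  race_nonhispanic_white_alone_percent_M = c(0.04, 0.03),
  age_universe = c(1000, 2000)
)

# Every variable belonging to one concept.
race_vars <- select(df, GEOID, starts_with("race_"))

# Derived percentages only; the "$" anchor excludes the "_percent_M" columns.
pct_vars <- select(df, GEOID, matches("_percent$"))

# Margins of error only. matches() ignores case by default, so case
# sensitivity is switched on to avoid also matching a lowercase "_m" suffix.
moe_vars <- select(df, GEOID, matches("_M$", ignore.case = FALSE))

names(pct_vars)
names(moe_vars)
```

Anchoring the patterns at the end of the name is a design choice in this sketch: it keeps estimates and their margins of error in separate selections even though they share a common prefix.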
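To make the margin-of-error propagation more concrete, here is a minimal sketch of the approximation that Census Bureau ACS guidance gives for a derived proportion. It is not necessarily how urbnindicators implements the calculation: moe_proportion is a hypothetical helper name, and the inputs are invented for illustration. The last lines show that moving from the 0–1 scale to a 0–100 scale multiplies the estimate and its MOE by the same constant.

```r
# Approximate MOE for a derived proportion p = num / denom, where num is a
# subset of denom. `moe_proportion` is a hypothetical helper written for
# this example; it is not a function exported by urbnindicators.
moe_proportion <- function(num, denom, moe_num, moe_denom) {
  p <- num / denom
  radicand <- moe_num^2 - p^2 * moe_denom^2
  # If the value under the square root is negative, the guidance falls back
  # to the formula for a ratio, which adds the two terms instead.
  radicand <- ifelse(radicand < 0, moe_num^2 + p^2 * moe_denom^2, radicand)
  sqrt(radicand) / denom
}

# Illustrative inputs only: a count, its universe, and their MOEs.
num <- 600
denom <- 1000
moe_num <- 50
moe_denom <- 80

p <- num / denom
p_moe <- moe_proportion(num, denom, moe_num, moe_denom)

# Rescaling to 0-100 requires nothing beyond multiplying both values by 100.
p_100 <- p * 100
p_moe_100 <- p_moe * 100

c(estimate = p, moe = p_moe)
```

With these inputs the value under the square root is positive, so the proportion formula applies directly; an input such as num = 900 with the same MOEs would trigger the ratio fallback.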