## Load libraries and functions
library(tidyverse)
library(tidycensus)
library(survey)
library(srvyr)
library(stringr)
library(ipumsr)
# Load all needed variables for DC, VA, and MD from the 2016-2020 ACS 5-year PUMS with tidycensus: https://walker-data.com/tidycensus/articles/pums-data.html#pums-data-dictionaries
# PUMA 2010 reference map: https://www.census.gov/geographies/reference-maps/2010/geo/2010-pumas.html (2016-2020 use 2010 pumas)
# The data dictionary for the 2016-2020 microdata has labels that explain the variable codes
pums_vars_2020 <- pums_variables %>%
  filter(year == 2020, survey == "acs5") %>%
  distinct(var_code, var_label, data_type, level)
# Call the PUMS API
get_pums(
  # list of variables to pull
  variables = c("PUMA", "TEN", "TYPEHUGQ", "SPORDER", "VALP", "BDSP", "BLD", "NP",
                "HINCP", "VACS", "GRNTP", "RNTP", "RAC1P", "HISP", "GRPIP", "OCPIP",
                "JWMNP", "WAGP", "DREM", "DPHY", "DOUT", "DDRS", "DEYE", "DEAR"),
  state = "DC",      # specify the state
  survey = "acs5",   # 5-year ACS
  year = 2020,       # end year of the 2016-2020 sample
  show_call = TRUE,  # print the API call that is made
  recode = TRUE) %>% # add labeled versions of coded variables
  mutate(jur = "District of Columbia") # add a jurisdiction label column
IPUMS Data
Please use all code samples responsibly. These are samples and will likely require adjustments to work correctly for your specific needs. Read through the documentation and comments to understand any caveats or limitations of the code and/or data, and follow up with the code author or Code Library admins (code_library@urban.org) if you have questions on how to adapt the sample to your specific use case.
IPUMS API
Purpose: The IPUMS API provides access to census microdata at the PUMA (Public Use Microdata Area) level. Microdata lets you create more detailed cross-tabulations than the pre-calculated census tables allow. For example, you can tabulate access to transportation by race, whereas the pre-tabulated census data report access to transportation and race separately but not the two disaggregated together.
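To illustrate the kind of cross-tabulation described above, here is a minimal sketch on a small synthetic data frame. The records are made up for illustration and are not real PUMS data; `race` and `commute` are hypothetical recoded columns standing in for variables such as RAC1P and a transportation variable, and `PWGTP` is the ACS person weight.

```r
library(dplyr)

# Synthetic person-level records (illustrative only, not real PUMS data).
# `race` and `commute` are hypothetical recoded columns; PWGTP is the person weight.
toy_pums <- tibble(
  race    = c("Black", "Black", "White", "White", "White"),
  commute = c("Transit", "Car", "Car", "Transit", "Car"),
  PWGTP   = c(20, 15, 30, 10, 25)
)

# Weighted cross-tabulation: sum the person weights within each race x commute cell
crosstab <- toy_pums %>%
  count(race, commute, wt = PWGTP, name = "weighted_people")

print(crosstab)
```

Each row of `crosstab` is an estimated number of people in that cell. An unweighted `count()` would only report how many sample records fall in each cell, which is not a population estimate.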
Data: Public-use microdata sample data. This is unit-level data from complex census surveys (like the ACS) that allows users to generate custom tables and estimates that aren’t available in pre-tabulated tables (like what you can pull using the tidycensus package). The most granular geography available for this data is the PUMA (public-use microdata area) level. For more information on crosswalking this geography to other census geographies, see the geographic crosswalk tutorial.
Author: Amy Rogin (May 2024)
This code sample is for using the IPUMS API but does not currently include code for analysis using the data. IPUMS has complex survey microdata (unit-level data) and requires the use of survey weights for aggregation. Reach out to the R Users Group Slack channel or the statistical methods user group with analysis questions.
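As a starting point only, the following sketch shows what weighted aggregation can look like with the srvyr package loaded above. It is not the author's analysis method. The data are synthetic, `disability` is a hypothetical recoded column, and the design uses only the person weight `PWGTP`, so the standard errors it produces are not the ACS-recommended ones. For real estimates, the ACS PUMS replicate weights (PWGTP1 through PWGTP80) are the usual route; tidycensus can request them with `get_pums(rep_weights = "person")` and build a design with `to_survey()`.

```r
library(dplyr)
library(srvyr)

# Synthetic person-level records (illustrative only, not real PUMS data).
# `disability` is a hypothetical recoded column; PWGTP is the person weight.
toy_pums <- tibble(
  disability = c("No", "Yes", "Yes", "No", "Yes"),
  PWGTP      = c(20, 15, 30, 10, 25)
)

# Weights-only survey design (assumption: no replicate weights or strata here)
toy_design <- toy_pums %>%
  as_survey_design(weights = PWGTP)

# Weighted totals and shares by group; *_se columns come from the simple design
disability_est <- toy_design %>%
  group_by(disability) %>%
  summarize(
    people = survey_total(),
    share  = survey_mean()
  )

print(disability_est)
```

The point estimates (`people`, `share`) depend only on the weights, so they match what a replicate-weight design would give; only the standard errors differ.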