ACS/Census Data

Reminder

Please use all code samples responsibly. These are samples and will likely require adjustments to work correctly for your specific needs. Read through the documentation and comments to understand any caveats or limitations of the code and/or data, and follow up with the code author or Code Library admins if you have questions about how to adapt the sample to your specific use case.

ACS/Census Data

Purpose: This script pulls data from the Census API using the tidycensus package. Specify the geography, year, and variables of interest. For a more in-depth resource on tidycensus, see An Introduction to tidycensus. If you are mapping this data, see the geographic crosswalk tutorial to understand how different census geographies relate to one another.

Data: This script uses the Census API, which provides access to ACS and decennial census data. The example below pulls 5-year ACS estimates.

Author: Amy Rogin (May 2023)

## Set-up
#### Load libraries and functions
library(tidyverse)
library(tidycensus)


# you will need a Census API key; sign up here: https://api.census.gov/data/key_signup.html
# install = TRUE saves the key to your .Renviron file, so you only need to run this once
census_api_key("your_key", install = TRUE, overwrite = TRUE)

# get a list of all of the ACS variables
# (variable IDs can change between years, so load the list for the year you plan to query)
vars <- load_variables(2020, "acs5") # specify the year we want and the survey (5-year ACS)

## TRANSPORTATION - COMMUTE TIME
# B08303_001 is the total for the travel-time-to-work table: the number of
# workers 16 years or older who did not work at home
travel_time <- 
  get_acs(geography = "tract", # you can change this to other geographies
          variables = "B08303_001", # this can also be a vector of multiple variables
          state = "IL", # this can also be a vector of states
          year = 2018, # the call defaults to the 5-year ACS, so this is actually 2014-2018
          geometry = FALSE) %>% 
  mutate(pop = estimate) %>% # copy the estimate into a column named pop
  select(GEOID, NAME, pop, moe) # moe is the margin of error for the estimate
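
The `vars` table returned by load_variables() can be filtered to find variable IDs instead of browsing it by hand. The sketch below assumes the table has `name`, `label`, and `concept` columns and uses a small toy table in its place so that it runs without an API call. `find_vars()` and `vars_toy` are hypothetical names introduced for illustration, and the toy labels are abbreviated stand-ins rather than an exact copy of the Census metadata.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the table returned by load_variables(); the real table has
# the same `name`, `label`, and `concept` columns but many more rows.
vars_toy <- tibble::tibble(
  name    = c("B01003_001", "B08303_001", "B08303_002"),
  label   = c("Estimate!!Total", "Estimate!!Total", "Estimate!!Total!!Less than 5 minutes"),
  concept = c("TOTAL POPULATION", "TRAVEL TIME TO WORK", "TRAVEL TIME TO WORK")
)

# Hypothetical helper: keep the rows whose concept matches a pattern,
# ignoring upper/lower case
find_vars <- function(vars, pattern) {
  vars %>%
    filter(str_detect(concept, regex(pattern, ignore_case = TRUE)))
}

commute_vars <- find_vars(vars_toy, "travel time")
```

With the real table, the same call would be `find_vars(vars, "travel time")`.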

Loop over years

Caution

Be careful of two things when looping over years of ACS/Census data:

1) Changes in geography: geographic boundaries can change between years (and particularly between decennial censuses). See the geographic crosswalk tutorial for more information.

2) Changes in variable IDs: variable IDs can also change between years, so make sure to look up the appropriate ID for each year you pull.
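
One way to check the second point is to look up the same variable ID in each year's variable list and compare the labels. The sketch below is a minimal example of that idea, not part of the original script. `label_by_year()` and its `load_fun` argument are hypothetical names. By default it assumes tidycensus::load_variables() for the 5-year ACS, but any function that maps a year to a data frame with `name` and `label` columns will work, which also makes the helper easy to try offline.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: look up one variable ID in each year's variable table
# and return one row per year in which the ID exists.
label_by_year <- function(var_id, years,
                          load_fun = function(yr) tidycensus::load_variables(yr, "acs5")) {
  map(years, function(yr) {
    load_fun(yr) %>%
      filter(name == var_id) %>%          # keep only the requested ID
      transmute(year = yr, name, label)   # record which year the label came from
  }) %>%
    bind_rows()
}

# Example call (requires an internet connection):
# label_by_year("B01003_001", 2015:2020)
```

If the result has fewer rows than years, the ID is missing in some year. If the labels differ, review them by eye, since some differences are only cosmetic.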

Purpose: get_acs() accepts only a single year per call, so to pull several years you need to wrap the call in a function and loop over the years.

Data: Compile multiple years of ACS data into one data frame

#### Load libraries and functions
library(tidyverse)
library(tidycensus)
# study years to iterate over 
years <- c(2015, 2016, 2017, 2018, 2019, 2020)

# STATE POPULATION
# pull total population (B01003_001) by state for a single year
state_pop <- function(yr){
  get_acs(geography = "state", # you can change this to other geographies
          year = yr, # the call defaults to the 5-year ACS
          variables = "B01003_001", # this can also be a vector of multiple variables
          geometry = TRUE,
          progress = FALSE) %>% 
    # add a column with the year the data came from
    mutate(year = yr)
}

# use map_df to loop over the years and combine into one data frame
pop_df <- map_df(years, state_pop)
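
In purrr 1.0.0 and later, map_df() is marked as superseded in favor of map() followed by list_rbind(). A minimal sketch of the same loop in that style is below, assuming purrr >= 1.0.0. `collect_years()` and `fake_fetch()` are hypothetical names, and the fake fetcher stands in for state_pop() so the example runs without an API call. The sketch has only been reasoned through for plain data frames, so check the class of the result if you keep geometry = TRUE.

```r
library(purrr)
library(tibble)

# Hypothetical helper: call `fetch_one` once per year and row-bind the results
collect_years <- function(years, fetch_one) {
  map(years, fetch_one) %>%
    list_rbind()
}

# Stand-in for state_pop(): returns a tiny made-up table tagged with the year
fake_fetch <- function(yr) {
  tibble(GEOID = c("01", "02"), estimate = c(100, 200), year = yr)
}

pop_toy <- collect_years(c(2015, 2016), fake_fetch)
```

With the real function, the equivalent call would be `collect_years(years, state_pop)`.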