Review

  • Use library(tigris) to easily import spatial data
  • Use ggplot() and geom_sf() to map multiple layers of data
  • Use st_join to perform spatial joins
  • Use various ggplot2 options to style maps

Exercise 1 and Only

Today, we are going to focus on creating a beautiful, styled map from start to finish.

Step 0: Set Up

Make a new script. Call it whatever you want!

Today we will be working with tidyverse, tigris, sf, urbnthemes, and urbnmapr, if you haven’t used tidyverse/tigris/sf submit install.packages("WHATEVER PACKAGE YOU NEED") to the Console. If you haven’t used urbnmapr or urbnthemes first submit install.packages("devtools") to the Console, then submit remotes::install_github("UrbanInstitute/urbnmapr") to the Console and/or remotes::install_github("UrbanInstitute/urbnthemes") to the Console, depending on what you need to install. Otherwise just load the following code and Use library(urbnthemes) to set your default to "map".

library(tidyverse)
library(urbnthemes)
library(urbnmapr)
library(tigris)
library(sf)

set_urbn_defaults(style = "map")

Step 1: Getting Spatial Data

For this map, we are going multiple different geographic levels: census tracts, counties, and states.

Use library(tigris) to load census tracts for Maryland, Virigina, and DC, and library(urbnmapr) to load states and counties.

dc_tracts <- tigris::tracts(state = "DC",
                            cb = TRUE,
                            class = "sf")
va_tracts <- tigris::tracts(state = "VA",
                            cb = TRUE,
                            class = "sf")
md_tracts <- tigris::tracts(state = "MD",
                            cb = TRUE,
                            class = "sf")

dmv_tracts <- rbind(dc_tracts, va_tracts, md_tracts)

states <- get_urbn_map(map = "states", sf = TRUE) %>% 
  filter(state_abbv %in% c("DC", "MD", "VA"))

counties <- get_urbn_map(map = "counties", sf = TRUE)

Look at each dataframe. Map them if you wish!

Step 2: Bring in non-spatial data

Copy and paste the following code to read in mortgage data at the tract level for several DMV jurisdictions.

tract_data <- read_csv("https://r-training.s3.us-east-2.amazonaws.com/dmv_hmda_data.csv",
                       col_types = cols(census_tract = col_character(),
                                        county_fips = col_character(),
                                        loans = col_integer(),
                                        loan_amount = col_integer(),
                                        borrower_income = col_integer()))

Join the tract data to the tract shapes.

geo_tracts <- left_join(dmv_tracts, tract_data, by = c("GEOID" = "census_tract"))

geo_tracts <- geo_tracts %>% 
  filter(!is.na(loans))

nrow(tract_data) == nrow(geo_tracts)
## [1] TRUE

Step 3: Clean up our spatial data

We have spatial data for all counties, but we only want to map the borders for counties that we have tract-level data for. Use the tract file to filter down the county dataset to just the counties that we want to include.

county_codes <- tract_data %>% 
  pull(county_fips) %>% 
  unique()

dmv_counties <- counties %>% 
  filter(county_fips %in% county_codes)

dmv_counties %>% pull(county_name)
## [1] "Fairfax city"           "Alexandria city"        "Falls Church city"     
## [4] "Fairfax County"         "Montgomery County"      "Arlington County"      
## [7] "District of Columbia"   "Prince George's County"

Map the states and counties. What is wrong?

ggplot() +
  geom_sf(data = states, mapping = aes(),
          fill = "grey", color = "black") +
  geom_sf(data = dmv_counties, mapping = aes(),
          fill = "blue", color = "white")

Reproject all three dataframes to Maryland State Plane.

states <- st_transform(states, crs = 6487)

dmv_counties <- st_transform(dmv_counties, crs = 6487)

geo_tracts <- st_transform(geo_tracts, crs = 6487)

Crop the states to the extent of the county boundaries.

dmv_states <- st_crop(states, dmv_counties)

ggplot() +
  geom_sf(data = dmv_states, mapping = aes(),
          fill = "grey", color = "black") +
  geom_sf(data = dmv_counties, mapping = aes(),
          fill = "blue", color = "white")

Step 4: Begin Mapping

Let’s map all three layers at the same time. Play around with the order of the layers. The goal is to have blue tracts with thin white outlines, thicker outlines for the counties, and a grey background for the states.

ggplot() +
  geom_sf(data = dmv_states, mapping = aes(),
          fill = "grey", color = "black") +
  geom_sf(data = geo_tracts, mapping = aes(),
          fill = "#1696d2", color = "white", size = 0.1) +
    geom_sf(data = dmv_counties, mapping = aes(),
          fill = NA, color = "white", size = 1)

Step 5: Map a variable!

ggplot() +
  geom_sf(data = dmv_states, mapping = aes(),
          fill = "#ececec", color = "#ec008b") +
  geom_sf(data = geo_tracts, mapping = aes(fill = borrower_income),
          color = "white", size = 0.1) +
  scale_fill_gradientn(na.value = "#9d9d9d",
                       labels = scales::dollar) +
  geom_sf(data = dmv_counties, mapping = aes(),
          fill = NA, color = "white", size = 1) +
  labs(fill = "Median borrower income, 2018")

#ggsave("Data/income_dmv_map.png")