Review

  • Use GeoJSON’s whenever possible!
  • Use st_read() and st_write() to read/write out spatial files
  • Be careful about Coordinate Reference Systems!
  • Use the tigris and tidycensus packages to easily obtain spatial data

Exercise 0: Create a Project (if you didn’t do this last week)

Step 1: If you haven’t in past weeks, create a new directory (folder) called mapping

Step 2: Open RStudio. Click “Project: (None)” in the top right corner. Click “New Project” and create a project based on the existing mapping directory or open your already existing mapping project.

Step 3: Today we will be working with tidyverse, tigris, sf, urbnthemes, and urbnmapr, if you haven’t used tidyverse/tigris/sf submit install.packages("WHATEVER PACKAGE YOU NEED") to the Console. If you haven’t used urbnmapr or urbnthemes first submit install.packages("devtools") to the Console, then submit remotes::install_github("UrbanInstitute/urbnmapr") to the Console and/or remotes::install_github("UrbanInstitute/urbnthemes") to the Console, depending on what you need to install.

Exercise 0.5: Setup

Step 1: Open a new .R script by clicking File>New File> R Script. Save the script as 03_spatial_operations.R.

Step 2: Copy and paste the following to the top of 03_spatial_operations.R. This loads in all the necessary libraries for today. To run the script, highlight it and click control+enter.

library(tidyverse)
library(sf)
library(urbnmapr)
library(urbnthemes)
library(tigris)

set_urbn_defaults(style = "map")

Exercise 1: Mapping Review

Step 1: Read in the data for fire stations in DC.

fire_stations <- st_read("https://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Safety_WebMercator/MapServer/6/query?where=1%3D1&outFields=NAME,ADDRESS,TRUCK,AMBULANCE&outSR=4326&f=geojson")
## Reading layer `OGRGeoJSON' from data source `https://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Safety_WebMercator/MapServer/6/query?where=1%3D1&outFields=NAME,ADDRESS,TRUCK,AMBULANCE&outSR=4326&f=geojson' using driver `GeoJSON'
## Simple feature collection with 44 features and 4 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -77.09353 ymin: 38.82054 xmax: -76.9334 ymax: 38.97343
## geographic CRS: WGS 84

Step 2: Use library(tigris) to load census tracts for DC. Be sure to use cartographic boundaries!

dc_tracts <- tracts(
  state = "DC",
  year = 2017,
  class = "sf",
  cb = TRUE
)

Step 3: Map the fire stations on top of the DC census tracts. Make the fire stations red!

ggplot() +
  geom_sf(
    data = dc_tracts, mapping = aes(),
    fill = "grey", color = "black"
  ) +
  geom_sf(
    data = fire_stations, mapping = aes(),
    color = "red"
  )

Spatial Joins

Spatial joins

In non-spatial joins, like we demonstrated on day 1, you typically join two datasets on a common identifier. In spatial joins, you are instead joining on location.

The important thing to note for spatial joins is that the two spatial datasets that you are joining must be in the same CRS.

Exercise 2: Spatial Joins

Step 1: Check the CRS of the two spatial datasets that we read in.

Step 2: Change the CRS of fire_stations and dc_tracts to EPSG code 6487.

dc_tracts <- st_transform(dc_tracts, crs = 6487)

fire_stations <- st_transform(fire_stations, crs = 6487)

st_join()

st_join() is the primary function for spatial joins of sf objects. The basic syntax is below:

joined_data <- st_join(x, y, join = st_intersects())

The geometry on the left side of the join is the geometry that will be retained. By default, st_join() will perform a left join, which means only the observations on the left hand side will be retained. You can turn this feature off with the left = FALSE argument- this will result in an inner join, where only observations from the left hand side that have matches in the data from the right hand side will be kept.

Step 3: Spatial join the fire stations to the census tract layers. Make sure to maintain the geometries of the census tracts.

dc_stations <- st_join(dc_tracts, fire_stations, join = st_intersects)

Step 4: Compare the number of observations of dc_tracts and dc_stations. What happened?

Step 5: Do another spatial join, but this time with the fire_stations on the left side. Plot the results. What changed?

Step 6: Count the number of fire stations per census tract. Map this variable in a choropleth map.

dc_stations %>%
  mutate(station = ifelse(is.na(NAME.y),
    0,
    1
  )) %>%
  group_by(GEOID) %>%
  summarize(stations = sum(station)) %>%
  ggplot() +
  geom_sf(mapping = aes(fill = factor(stations))) +
  scale_fill_discrete() +
  labs(fill = "Fire stations per census tract")

Creating Buffers

When we are working with spatial data, we sometimes want to create buffers. For instance, maybe you want to find all the houses that are within a mile of a bus station. st_buffer() is a helpful function for creating buffers around points, lines or polygons.

library(units) is helpful for converting between different units of distance.

Exercise 3: Buffers

Step 1: Play around with units! Assign an object in your global environment for whatever distance you would like.

buffer_distance <- units::set_units(.5, "mile")

Step 2: Convert that distance into meters.

buffer_distance <- buffer_distance %>%
  units::set_units("meters")

Step 3: Create a buffer of buffer_distance around the fire stations.

buffer_stations <- st_buffer(fire_stations, dist = buffer_distance)

Step 4: Plot the results. Hint: use the alpha = options in geom_sf() to add transparency and see where the buffers overlap!

Aggregation

Sometimes, we want to take smaller geographies and group them into larger geographies. We can do this using the same dplyr tools that we use to combine larger data: group_by() and summarize().

Exercise 4: Aggregation

Step 1: Use urbnmapr to load the sf dataframe of the United States.

states <- get_urbn_map(map = "states", sf = TRUE)

Step 2: Copy and paste the following code into your console. This will attach a census region to each state.

states <- states %>%
  mutate(region = case_when(
    state_abbv %in% c("ME", "RI", "CT", "NY", "NJ", "PA", "VT", "NH", "MA") ~ "Northeast",
    state_abbv %in% c(
      "TX", "LA", "OK", "AR", "TN", "AL", "MS", "KY", "WV",
      "MD", "DC", "VA", "NC", "SC", "GA", "FL", "DE"
    ) ~ "South",
    state_abbv %in% c(
      "ND", "SD", "NE", "KS", "MN", "IA", "MO", "WI", "IL",
      "IN", "MI", "OH"
    ) ~ "Midwest",
    state_abbv %in% c(
      "CA", "OR", "WA", "ID", "MT", "WY", "UT", "CO", "AZ",
      "NM", "AK", "HI", "NV"
    ) ~ "West"
  ))

Step 3: Use group_by() and summarize() to aggregate the state geographies into census regions.

regions <- states %>%
  group_by(region) %>%
  summarize()

ggplot() +
  geom_sf(data = regions, mapping = aes(fill = region))