st_read()
and st_write()
to read/write out spatial filestigris
and tidycensus
packages to easily obtain spatial dataStep 1: If you haven’t in past weeks, create a new directory (folder) called mapping
Step 2: Open RStudio. Click “Project: (None)” in the top right corner. Click “New Project” and create a project based on the existing mapping
directory or open your already existing mapping project.
Step 3: Today we will be working with tidyverse, tigris, sf, urbnthemes, and urbnmapr, if you haven’t used tidyverse/tigris/sf submit install.packages("WHATEVER PACKAGE YOU NEED")
to the Console. If you haven’t used urbnmapr or urbnthemes first submit install.packages("devtools")
to the Console, then submit remotes::install_github("UrbanInstitute/urbnmapr")
to the Console and/or remotes::install_github("UrbanInstitute/urbnthemes")
to the Console, depending on what you need to install.
Step 1: Open a new .R
script by clicking File>New File> R Script. Save the script as 03_spatial_operations.R
.
Step 2: Copy and paste the following to the top of 03_spatial_operations.R
. This loads in all the necessary libraries for today. To run the script, highlight it and click control+enter.
library(tidyverse)
library(sf)
library(urbnmapr)
library(urbnthemes)
library(tigris)
set_urbn_defaults(style = "map")
Step 1: Read in the data for fire stations in DC.
fire_stations <- st_read("https://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Safety_WebMercator/MapServer/6/query?where=1%3D1&outFields=NAME,ADDRESS,TRUCK,AMBULANCE&outSR=4326&f=geojson")
## Reading layer `OGRGeoJSON' from data source `https://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/Public_Safety_WebMercator/MapServer/6/query?where=1%3D1&outFields=NAME,ADDRESS,TRUCK,AMBULANCE&outSR=4326&f=geojson' using driver `GeoJSON'
## Simple feature collection with 44 features and 4 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -77.09353 ymin: 38.82054 xmax: -76.9334 ymax: 38.97343
## geographic CRS: WGS 84
Step 2: Use library(tigris)
to load census tracts for DC. Be sure to use cartographic boundaries!
dc_tracts <- tracts(
state = "DC",
year = 2017,
class = "sf",
cb = TRUE
)
Step 3: Map the fire stations on top of the DC census tracts. Make the fire stations red!
ggplot() +
geom_sf(
data = dc_tracts, mapping = aes(),
fill = "grey", color = "black"
) +
geom_sf(
data = fire_stations, mapping = aes(),
color = "red"
)
Spatial joins
In non-spatial joins, like we demonstrated on day 1, you typically join two datasets on a common identifier. In spatial joins, you are instead joining on location.
The important thing to note for spatial joins is that the two spatial datasets that you are joining must be in the same CRS.
Step 1: Check the CRS of the two spatial datasets that we read in.
Step 2: Change the CRS of fire_stations
and dc_tracts
to EPSG code 6487.
dc_tracts <- st_transform(dc_tracts, crs = 6487)
fire_stations <- st_transform(fire_stations, crs = 6487)
st_join()
st_join()
is the primary function for spatial joins of sf
objects. The basic syntax is below:
joined_data <- st_join(x, y, join = st_intersects())
The geometry on the left side of the join is the geometry that will be retained. By default, st_join()
will perform a left join, which means only the observations on the left hand side will be retained. You can turn this feature off with the left = FALSE
argument- this will result in an inner join, where only observations from the left hand side that have matches in the data from the right hand side will be kept.
Step 3: Spatial join the fire stations to the census tract layers. Make sure to maintain the geometries of the census tracts.
dc_stations <- st_join(dc_tracts, fire_stations, join = st_intersects)
Step 4: Compare the number of observations of dc_tracts
and dc_stations
. What happened?
Step 5: Do another spatial join, but this time with the fire_stations on the left side. Plot the results. What changed?
Step 6: Count the number of fire stations per census tract. Map this variable in a choropleth map.
dc_stations %>%
mutate(station = ifelse(is.na(NAME.y),
0,
1
)) %>%
group_by(GEOID) %>%
summarize(stations = sum(station)) %>%
ggplot() +
geom_sf(mapping = aes(fill = factor(stations))) +
scale_fill_discrete() +
labs(fill = "Fire stations per census tract")
When we are working with spatial data, we sometimes want to create buffers. For instance, maybe you want to find all the houses that are within a mile of a bus station. st_buffer()
is a helpful function for creating buffers around points, lines or polygons.
library(units)
is helpful for converting between different units of distance.
Step 1: Play around with units! Assign an object in your global environment for whatever distance you would like.
buffer_distance <- units::set_units(.5, "mile")
Step 2: Convert that distance into meters.
buffer_distance <- buffer_distance %>%
units::set_units("meters")
Step 3: Create a buffer of buffer_distance
around the fire stations.
buffer_stations <- st_buffer(fire_stations, dist = buffer_distance)
Step 4: Plot the results. Hint: use the alpha =
options in geom_sf()
to add transparency and see where the buffers overlap!
Sometimes, we want to take smaller geographies and group them into larger geographies. We can do this using the same dplyr
tools that we use to combine larger data: group_by()
and summarize()
.
Step 1: Use urbnmapr
to load the sf
dataframe of the United States.
states <- get_urbn_map(map = "states", sf = TRUE)
Step 2: Copy and paste the following code into your console. This will attach a census region to each state.
states <- states %>%
mutate(region = case_when(
state_abbv %in% c("ME", "RI", "CT", "NY", "NJ", "PA", "VT", "NH", "MA") ~ "Northeast",
state_abbv %in% c(
"TX", "LA", "OK", "AR", "TN", "AL", "MS", "KY", "WV",
"MD", "DC", "VA", "NC", "SC", "GA", "FL", "DE"
) ~ "South",
state_abbv %in% c(
"ND", "SD", "NE", "KS", "MN", "IA", "MO", "WI", "IL",
"IN", "MI", "OH"
) ~ "Midwest",
state_abbv %in% c(
"CA", "OR", "WA", "ID", "MT", "WY", "UT", "CO", "AZ",
"NM", "AK", "HI", "NV"
) ~ "West"
))
Step 3: Use group_by()
and summarize()
to aggregate the state geographies into census regions.
regions <- states %>%
group_by(region) %>%
summarize()
ggplot() +
geom_sf(data = regions, mapping = aes(fill = region))