Census Tract Origin Destination Driving Times, 2017

Download the Data

Complete Dataset (6.2 GB)

Methodology

We use 2010 Census tract geographies and select all unique pairs of Census tracts in the 50 states and the District of Columbia whose centroids are within 150 miles of each other by straight-line distance. The centroid of each tract serves as the "address" of that tract. This selection yields over 154 million pairs. We build a routing engine with the Open Source Routing Machine (OSRM), using the OpenStreetMap North America data extract as of June 7, 2017. The engine calculates routes for cars but not for public transportation, which we do not include in this analysis.
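The pair selection described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used to build the dataset: it assumes the straight-line distance is a great-circle (haversine) distance, which the methodology does not specify, and the function names and tract identifiers are hypothetical.

```python
import math
from itertools import combinations

# Mean Earth radius in miles (an approximation; assumption for this sketch).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def pairs_within_radius(centroids, max_miles=150.0):
    """Return unique, unordered tract pairs whose centroids are within max_miles.

    centroids maps a tract identifier to a (latitude, longitude) tuple.
    Each pair appears once, with the smaller identifier first.
    """
    pairs = []
    for (id_a, (lat_a, lon_a)), (id_b, (lat_b, lon_b)) in combinations(
        sorted(centroids.items()), 2
    ):
        if haversine_miles(lat_a, lon_a, lat_b, lon_b) <= max_miles:
            pairs.append((id_a, id_b))
    return pairs
```

Calling pairs_within_radius on a dictionary of centroids returns the list of qualifying pairs. The brute-force comparison of every pair is only suitable for small inputs; at national scale a spatial index or a distributed join would likely be needed.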

OpenStreetMap is an open source, crowdsourced dataset that dedicated volunteers update continuously. As a result, it may contain errors and omissions, and it may be out of date in certain locations. We use the extract dated June 7, 2017.

We use the routing engine to calculate the distance and travel time of the fastest driving route, assuming no traffic, between each of the over 154 million pairs, using the OSRM /route/v1/driving endpoint. For more information on how these routes are calculated, see OSRM. We speed up the calculation by combining container technology with Apache Spark's parallel processing framework. We run the OSRM-backend container with the pre-processed routing network on 50 instances simultaneously, bootstrapped through AWS EMR services. We then use Apache Spark to queue the millions of calculations, send them out to the machines, and write the results to cloud storage. Finally, additional scripts recombine the data into a single complete file and state-level files for download.
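A single request to the /route/v1/driving endpoint might look like the following sketch. It is illustrative only and is not taken from the project's code: the base URL (a local osrm-backend instance on port 5000) and the function names are assumptions. OSRM expects coordinates in longitude,latitude order and reports distance in meters and duration in seconds.

```python
import json
import urllib.request

# Assumption: an osrm-backend instance is listening locally on port 5000.
OSRM_BASE_URL = "http://localhost:5000"


def build_route_url(origin, destination, base_url=OSRM_BASE_URL):
    """Build a route request URL from two (latitude, longitude) tuples.

    OSRM takes coordinates as longitude,latitude, so the order is swapped here.
    overview=false skips the route geometry, which is not needed for
    distance and duration alone.
    """
    (lat_o, lon_o), (lat_d, lon_d) = origin, destination
    coords = f"{lon_o:.6f},{lat_o:.6f};{lon_d:.6f},{lat_d:.6f}"
    return f"{base_url}/route/v1/driving/{coords}?overview=false"


def parse_route(payload):
    """Return (distance_meters, duration_seconds) for the first route, or None."""
    if payload.get("code") != "Ok" or not payload.get("routes"):
        return None
    best = payload["routes"][0]
    return best["distance"], best["duration"]


def fetch_route(origin, destination, base_url=OSRM_BASE_URL, timeout=10):
    """Query the routing engine for one pair; may raise on HTTP or network errors."""
    url = build_route_url(origin, destination, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_route(json.load(response))
```

In a parallel setup such as the one described above, a function like fetch_route would be applied to each tract pair by the worker processes, with each worker pointing at its own routing engine instance.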

To reproduce the full process and create the data yourself, see the GitHub repository at https://github.com/UI-Research/spark-osrm.

For more information, contact Graham MacDonald at the Urban Institute.

Update

The data represents a snapshot from 2017.