Why Python

Think of Python as a tool for filling analysis gaps rather than a replacement for other languages that already have robust infrastructure at Urban (like R and Stata 👀).

Python is great for collecting data (scraping, APIs, other creative sourcing), integrating with other tools (e.g., AWS), and automating tasks.

To get a sense of what using Python at Urban has looked like, check out these awesome project examples:

Automating Zoning Data Collection

  • Python can help automate the data collection pipeline

Web scraping the AARP disparities dashboard

  • Use Python to download online data that’s too involved to collect manually

Collecting, reformatting, and processing LODES employment data

  • Use web scraping and big data computing power to collect over 75,000 files for the Data Catalog
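As a rough illustration of the kind of scraping these projects involve, here is a minimal sketch that pulls file download links out of a page. It is not code from any of the projects above. The page content, the `example.com` address, the `.csv` filter, and the `LinkCollector` class name are all assumptions made up for the example, and it uses only Python's standard library so it runs without installing anything.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the target of every <a> tag whose href ends with a given suffix."""

    def __init__(self, base_url, suffix=".csv"):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.suffix = suffix      # only keep links ending with this
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.endswith(self.suffix):
            # Turn a relative link like "/files/x.csv" into a full URL
            self.links.append(urljoin(self.base_url, href))


# Hypothetical page content; a real scraper would fetch this over HTTP
SAMPLE_HTML = """
<ul>
  <li><a href="/files/jobs_2019.csv">2019</a></li>
  <li><a href="/files/jobs_2020.csv">2020</a></li>
  <li><a href="/about.html">About</a></li>
</ul>
"""

collector = LinkCollector("https://example.com/data/")
collector.feed(SAMPLE_HTML)

for link in collector.links:
    print(link)
```

In a real pipeline, the list of links would then be passed to a download step, and a library like Beautiful Soup would usually replace the hand-written parser.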

Over the past decade, Python has made great strides in its data analysis capabilities, with machine learning packages like scikit-learn, data visualization tools like seaborn and plotly, text analysis packages like NLTK, web scraping libraries like Beautiful Soup, and interfaces for working with big data and cloud technology like PySpark and boto3.

Packages such as reticulate and rpy2 have also made using Python alongside R much easier, allowing users to benefit from the comparative advantages of the two languages together.

Python’s multiple IDE options, like Spyder, PyCharm, or Jupyter Notebook, also give users flexibility in how they develop and share their work with others.

Slides and a recording from a previous Python Users Group session discussing use cases for Python at Urban are available here.