library(tidyverse)
library(readxl)
# file path to excel file
<- file.path("sample_excel_file.xlsx")
path
# This uses readxl and purrr to loop through each sheet in a workbook and build a dataframe (Note you can define which cell range to extract).
<- path %>%
xls_data # this gets each of the sheets in the excel file
excel_sheets() %>%
set_names() %>%
# this loops through each excel sheet and appends them together into one df
map_df(~ read_excel(path = path, col_types="text", sheet = .x, range = "A1:E30"), .id = "sheet")
Working with Excel Files
Please use all code samples responsibly - these are samples and likely require adjustments to work correctly for your specific needs. Read through the documentation and comments to understand any caveats or limitations of the code and/or data and follow-up with the code author or Code Library admins (code_library@urban.org) if you have questions on how to adapt the sample to your specific use case.
Purpose: Excel files can be tricky to work with, especially if they have data stored in multiple sheets. This code is an example of how to read in a file and merge together each of the sheets into one main dataframe. It also shows how to define the cell range to extract from each sheet.
Data: This example uses a fake excel file that has multiple different sheets.
Author: Aleszu Bajak (February 2024)
Similarly, there may be cases where you have multiple excel files, each with multiple sheets that you want to loop through and combine all into one data frame. You can do that with this code:
# This takes a directory of XLS files and reads each and brings all data together into a dataframe (Note you can define which cell range to extract).
# Downloaded batch of XLSs from countyhealthrankings.org/
<- fs::dir_ls(path = "RWJ_county_health_rankings", glob = "*.xlsx")
RWJ_xlsx_files
# Loop through all Excel files and extract a specific range of cells
<- map_df(set_names(RWJ_xlsx_files), function(file) {
data_xlsx_df %>%
file excel_sheets() %>%
set_names() %>%
map_df(
~ read_xlsx(path = file, sheet = 4, range = "A2:IF300"),
.id = "sheet")
.id = "file") },