最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

r - How can I skip Excel Files if they do not contain a specific Sheet? - Stack Overflow

programmeradmin0浏览0评论

Here's the code I'm using:

library(readxl) 
setwd(file.path(dirname("~"), "/Shared Documents/Programs/Data and Reporting/Data Quality Reports/Org Level Data"))
lst = list.files(pattern="*.xlsx")
df = data.frame()
for(table in lst){
  dataFromExcel <- read_excel(table, sheet = "Project Summary")
  df <- rbind(df,dataFromExcel)
}
write.csv(df, "_Project Level data.csv")

I basically know nothing about R, and simply mashed together code from a couple sites, editing what little I understood.

Here's the scenario: I have a bunch of Excel files that I download and put into a folder called "Org Level Data". I run this script and it creates a new file with all the data in each file's "Project Summary" sheet. However, it errors out if one of those files does not contain a sheet called "Project Summary", which will be quite a few files. I can get around this by removing those files from the folders, but I'd really like this script to just skip those files and ignore them, if possible.

I saw something about read_excel_safely but I cannot figure out how to insert that into my code, since I understand very little about the "read_excel" and "rbind" sections.

I tried adding , na = "---") to the read_excel() function, thinking that it would handle that not existing, but it'd didn't add or take away anything from the script running.

I've also looked into trying to make lst check for that "Project Summary" sheet when being created, but I don't know how to do that.

Here's the code I'm using:

library(readxl) 
setwd(file.path(dirname("~"), "/Shared Documents/Programs/Data and Reporting/Data Quality Reports/Org Level Data"))
lst = list.files(pattern="*.xlsx")
df = data.frame()
for(table in lst){
  dataFromExcel <- read_excel(table, sheet = "Project Summary")
  df <- rbind(df,dataFromExcel)
}
write.csv(df, "_Project Level data.csv")

I basically know nothing about R, and simply mashed together code from a couple sites, editing what little I understood.

Here's the scenario: I have a bunch of Excel files that I download and put into a folder called "Org Level Data". I run this script and it creates a new file with all the data in each file's "Project Summary" sheet. However, it errors out if one of those files does not contain a sheet called "Project Summary", which will be quite a few files. I can get around this by removing those files from the folders, but I'd really like this script to just skip those files and ignore them, if possible.

I saw something about read_excel_safely but I cannot figure out how to insert that into my code, since I understand very little about the "read_excel" and "rbind" sections.

I tried adding , na = "---") to the read_excel() function, thinking that it would handle that not existing, but it'd didn't add or take away anything from the script running.

I've also looked into trying to make lst check for that "Project Summary" sheet when being created, but I don't know how to do that.

Share Improve this question edited Feb 6 at 17:35 margusl 17.4k3 gold badges19 silver badges28 bronze badges asked Feb 6 at 17:26 AdorableRaspberry953AdorableRaspberry953 11 silver badge1 bronze badge New contributor AdorableRaspberry953 is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
Add a comment  | 

1 Answer 1

Reset to default 1

You can use an if with readxl::excel_sheets to check whether a file contains a sheet with the required name and if not skip that file in your for loop by passing on to the next file, e.g. in the code below the second file is skipped.

Using some fake example data and excel files:

library(readxl)

path <- tempdir()

### Create example files
writexl::write_xlsx(
  list(
    "Project Summary" = mtcars
  ),
  file.path(path, "mtcars.xlsx")
)

writexl::write_xlsx(
  list(
    "Sheet" = iris
  ),
  file.path(path, "iris.xlsx")
)
###

setwd(path)

lst <- list.files(pattern = "*.xlsx")
df <- data.frame()
for (table in lst) {
  if (!"Project Summary" %in% excel_sheets(table)) next
  dataFromExcel <- read_excel(table, sheet = "Project Summary")
  df <- rbind(df, dataFromExcel)
}
#write.csv(df, "_Project Level data.csv")

df
#> # A tibble: 32 × 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # ℹ 22 more rows
发布评论

评论列表(0)

  1. 暂无评论