r - Is there a way to pass a list of filter parameters to `dplyr::filter`? - Stack Overflow

I want to filter a dataframe on the values in multiple columns, without needing to hard code the columns and values inside the dplyr::filter call. Essentially, I want to avoid this:

library(dplyr)  # provides filter() and the %>% pipe used below

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"), 
  c = 1:8
)
df_in
df_out <- df_in %>% 
  dplyr::filter(
    !grepl("a", a), b == "second", c < 5   ##  I want to avoid burying this in my code
  )
df_out

I want to do something like this, with an imaginary prep_function and eval_function:

filt_crit <- prep_function(!grepl("a", a), b == "second", c < 5)
df_out <- df_in %>% dplyr::filter(eval_function(filt_crit))
df_out

I can use rlang::expr to filter based on one criterion:

filt_crit1 <- rlang::expr(!grepl("a", a))
df_partial <- df_in %>% dplyr::filter(eval(filt_crit1))
df_partial

I've figured out a way to do this with purrr::reduce(dplyr::filter(...)), iterating over filt_crit:

filt_crit <- c(rlang::expr(!grepl("a", a)), rlang::expr(b == "second"), rlang::expr(c < 5))
df_out <- filt_crit %>% 
  purrr::reduce(\(acc, nxt) dplyr::filter(acc, eval(nxt)), .init = df_in)
df_out

This seems a bit clunky. Is purrr::reduce the most straightforward solution? Thanks!

asked Feb 7 at 21:56 by Josh, edited Feb 8 at 5:42 by stefan

2 Answers

You can achieve your desired result by wrapping your filter conditions in rlang::exprs to create a list of expressions, then passing the conditions to dplyr::filter with the splice operator !!!:

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
  c = 1:8
)

library(dplyr, warn.conflicts = FALSE)

.filt_crit <- rlang::exprs(!grepl("a", a), b == "second", c < 5)

df_in |> filter(!!!.filt_crit)
#>       a      b c
#> 1 first second 1
#> 2 first second 2

.filt_crit <- rlang::exprs(!grepl("a", a), c < 5)

df_in |> filter(!!!.filt_crit)
#>       a      b c
#> 1 first second 1
#> 2 first second 2
#> 3 first  loser 3
#> 4 first  loser 4
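
Since the criteria are an ordinary list of expressions, they can also be built from data rather than typed out. The following is a minimal sketch, not part of the original answer: the `wanted` lookup is a hypothetical name introduced for illustration, and `rlang::call2` is used to construct each `column == value` call.

```r
library(dplyr, warn.conflicts = FALSE)

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
  c = 1:8
)

# Hypothetical lookup (an assumption for this sketch):
# column name -> value that column must equal
wanted <- list(a = "first", b = "second")

# Build one `column == value` call per entry
eq_crit <- Map(
  function(col, value) rlang::call2("==", rlang::sym(col), value),
  names(wanted), wanted
)

# Append a hand-written criterion; unname() because filter() rejects
# named arguments (it assumes `=` was typed instead of `==`)
filt_crit <- unname(c(eq_crit, rlang::exprs(c < 5)))

df_out <- df_in |> filter(!!!filt_crit)
df_out
```

The `unname()` call is the part most easily forgotten: `Map()` names its result after the columns, and splicing a named list into `filter()` raises an error.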

1) We can use dot dot dot:

library(dplyr)

prep <- function(data, ...) filter(data, ...)
prep(df_in, !grepl("a", a), b == "second", c < 5)

##       a      b c
## 1 first second 1
## 2 first second 2

These also work and give the same output. The first uses a list of calls and the second uses a character vector. prep is unchanged from above.

library(dplyr)
library(rlang)

e <- exprs(!grepl("a", a), b == "second", c < 5)
prep(df_in, !!!e)

ch <- c('!grepl("a", a)', 'b == "second"', 'c < 5')
prep(df_in, !!!parse_exprs(ch))
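
One thing the character-vector form makes easy is keeping the criteria outside the filtering code and selecting a subset at run time. A rough sketch, assuming a hypothetical named vector `catalogue` and selection `chosen` (neither appears in the original answer):

```r
library(dplyr, warn.conflicts = FALSE)
library(rlang)

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
  c = 1:8
)

prep <- function(data, ...) filter(data, ...)

# Hypothetical catalogue of criteria stored as strings
catalogue <- c(
  not_a   = '!grepl("a", a)',
  second  = 'b == "second"',
  small_c = 'c < 5'
)

# Pick criteria by name; names are dropped before and after parsing
# because filter() does not accept named arguments
chosen <- c("not_a", "small_c")
crit <- unname(parse_exprs(unname(catalogue[chosen])))

res <- prep(df_in, !!!crit)
res
```

Because `parse_exprs` turns arbitrary text into code that `filter` will evaluate, this pattern is only appropriate when the strings come from a trusted source.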

2) Another possible approach is to have two functions: prep, shown above, for passing the unquoted expressions, and prepObj, which is an S3 generic with methods for the cases where you want to pass a single character vector or a list of call objects.

prepObj <- function(data, criteria) UseMethod("prepObj", criteria)

prepObj.character <- function(data, criteria) {
  filter(data, !!!parse_exprs(criteria))
}

prepObj.list <- function(data, criteria) {
  filter(data, !!!criteria)
}

prepObj(df_in, e)
prepObj(df_in, ch)
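
With only those two methods, any other input fails with R's generic "no applicable method" error. A default method can give a clearer message. This is a sketch of one possible addition, not something from the original answer; the wording of the message is an assumption.

```r
library(dplyr, warn.conflicts = FALSE)
library(rlang)

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
  c = 1:8
)

# Generic and methods as in the answer; dispatch is on `criteria`
prepObj <- function(data, criteria) UseMethod("prepObj", criteria)
prepObj.character <- function(data, criteria) filter(data, !!!parse_exprs(criteria))
prepObj.list <- function(data, criteria) filter(data, !!!criteria)

# Added for illustration: fallback for unsupported `criteria` types
prepObj.default <- function(data, criteria) {
  stop(
    "`criteria` must be a character vector or a list of expressions, not <",
    class(criteria)[1], ">",
    call. = FALSE
  )
}

ok <- prepObj(df_in, exprs(c < 5))

# A numeric `criteria` now reaches the default method
msg <- tryCatch(prepObj(df_in, 42), error = conditionMessage)
msg
```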

3) This version handles all three cases: unquoted arguments, a character vector, and a list of calls. It first tries to run the arguments as unquoted filter conditions; if that fails, it checks which of the other two cases applies and handles it accordingly.

library(dplyr)
library(rlang)

prep2 <- function(data, ...) {
  y <- try(filter(data, ...), silent = TRUE)
  if (inherits(y, "try-error")) {
    if (is.character(..1)) filter(data, !!!parse_exprs(..1))
    else if (is.list(..1)) filter(data, !!!..1)
    else stop("invalid argument")
  } else y
}

prep2(df_in, !grepl("a", a), b == "second", c < 5)
prep2(df_in, exprs(!grepl("a", a), b == "second", c < 5))
prep2(df_in, c('!grepl("a", a)', 'b == "second"', 'c < 5'))
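
A quick way to confirm that the three calling styles agree is to compare their results directly. The check below is a sketch that repeats prep2 from above so it runs on its own; the names `r1`, `r2`, and `r3` are introduced only for this comparison.

```r
library(dplyr, warn.conflicts = FALSE)
library(rlang)

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
  c = 1:8
)

# prep2 as defined in the answer
prep2 <- function(data, ...) {
  y <- try(filter(data, ...), silent = TRUE)
  if (inherits(y, "try-error")) {
    if (is.character(..1)) filter(data, !!!parse_exprs(..1))
    else if (is.list(..1)) filter(data, !!!..1)
    else stop("invalid argument")
  } else y
}

r1 <- prep2(df_in, !grepl("a", a), b == "second", c < 5)           # unquoted
r2 <- prep2(df_in, exprs(!grepl("a", a), b == "second", c < 5))    # list of calls
r3 <- prep2(df_in, c('!grepl("a", a)', 'b == "second"', 'c < 5'))  # character vector

# All three should select the same rows
stopifnot(isTRUE(all.equal(r1, r2)), isTRUE(all.equal(r1, r3)))
```

Note that the fallback branches only look at the first argument (`..1`), so a call that mixes styles, such as one unquoted condition followed by a character vector, is not covered by this design.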