I want to filter a data frame on the values in multiple columns without hard-coding the columns and values inside the dplyr::filter call. Essentially, I want to avoid this:
library(dplyr) # provides %>% and filter
df_in <- data.frame(
a = c("first", "first", "first", "first", "last", "last", "last", "last"),
b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
c = 1:8
)
df_in
df_out <- df_in %>%
dplyr::filter(
!grepl("a", a), b == "second", c < 5 ## I want to avoid burying this in my code
)
df_out
I want to do something like this, with an imaginary prep_function and eval_function:
filt_crit <- prep_function(!grepl("a", a), b == "second", c < 5)
df_out <- df_in %>% dplyr::filter(eval_function(filt_crit))
df_out
I can use rlang::expr to filter based on one criterion:
filt_crit1 <- rlang::expr(!grepl("a", a))
df_partial <- df_in %>% dplyr::filter(eval(filt_crit1))
df_partial
I've figured out a way to do this with purrr::reduce(dplyr::filter(...)), iterating over filt_crit:
filt_crit <- c(rlang::expr(!grepl("a", a)), rlang::expr(b == "second"), rlang::expr(c < 5))
df_out <- filt_crit %>%
purrr::reduce(\(acc, nxt) dplyr::filter(acc, eval(nxt)), .init = df_in)
df_out
This seems a bit clunky. Is purrr::reduce the most straightforward solution? Thanks!
2 Answers
You can achieve your desired result by wrapping your filter conditions in rlang::exprs to create a list of expressions, then passing the conditions to dplyr::filter using the unquote-splice operator !!!:
df_in <- data.frame(
a = c("first", "first", "first", "first", "last", "last", "last", "last"),
b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
c = 1:8
)
library(dplyr, warn.conflicts = FALSE)
.filt_crit <- rlang::exprs(!grepl("a", a), b == "second", c < 5)
df_in |> filter(!!!.filt_crit)
#> a b c
#> 1 first second 1
#> 2 first second 2
.filt_crit <- rlang::exprs(!grepl("a", a), c < 5)
df_in |> filter(!!!.filt_crit)
#> a b c
#> 1 first second 1
#> 2 first second 2
#> 3 first loser 3
#> 4 first loser 4
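If a single expression object is preferred over a list, the list can also be folded into one `&` call and injected with `!!`. This is only a sketch of that idea: the names `filt_crit`, `combined`, `df_combined`, and `df_spliced` are made up for illustration, and base `Reduce` is used here in place of `purrr::reduce`.

```r
library(dplyr, warn.conflicts = FALSE)

df_in <- data.frame(
  a = c("first", "first", "first", "first", "last", "last", "last", "last"),
  b = c("second", "second", "loser", "loser", "second", "second", "loser", "loser"),
  c = 1:8
)

# Criteria defined once, away from the filter call (illustrative name)
filt_crit <- rlang::exprs(!grepl("a", a), b == "second", c < 5)

# Fold the list into a single call that joins the conditions with `&`
combined <- Reduce(function(x, y) rlang::expr(!!x & !!y), filt_crit)

# Inject the single combined expression with !!
df_combined <- df_in |> filter(!!combined)

# Splice the list with !!! for comparison; filter() ANDs its conditions
df_spliced <- df_in |> filter(!!!filt_crit)
```

Both calls should select the same rows, because `filter()` keeps only rows where every condition is TRUE. Splicing with `!!!` is the shorter form; the folded version is mainly useful when one expression needs to be stored or printed.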
1) We can pass the conditions through dot-dot-dot (...):
library(dplyr)
prep <- function(data, ...) filter(data, ...)
prep(df_in, !grepl("a", a), b == "second", c < 5)
## a b c
## 1 first second 1
## 2 first second 2
These also work and give the same output. The first uses a list of calls and the second uses a character vector. prep is unchanged from above.
library(dplyr)
library(rlang)
e <- exprs(!grepl("a", a), b == "second", c < 5)
prep(df_in, !!!e)
ch <- c('!grepl("a", a)', 'b == "second"', 'c < 5')
prep(df_in, !!!parse_exprs(ch))
2) Another possible approach is to have two functions: prep, shown above, for passing the unquoted expressions, and prepObj, which is an S3 generic with methods in case you want to pass a single character vector or a list of call objects.
# Dispatch on the class of `criteria`, not on `data`
prepObj <- function(data, criteria) UseMethod("prepObj", criteria)
prepObj.character <- function(data, criteria) {
  filter(data, !!!parse_exprs(criteria))
}
prepObj.list <- function(data, criteria) {
  filter(data, !!!criteria)
}
prepObj(df_in, e)
prepObj(df_in, ch)
3) This version works with all three cases: unquoted arguments, a character vector, and a list of calls. It first tries to run the filter on the unquoted arguments; if that fails, it checks which of the other two cases applies and runs the filter accordingly.
library(dplyr)
library(rlang)
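prep2 <- function(data, ...) {
  # Try the arguments as ordinary unquoted filter conditions first
  y <- try(filter(data, ...), silent = TRUE)
  if (inherits(y, "try-error")) {
    # Fall back on the first argument: character vector or list of calls
    if (is.character(..1)) filter(data, !!!parse_exprs(..1))
    else if (is.list(..1)) filter(data, !!!..1)
    else stop("invalid argument")
  } else y
}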
prep2(df_in, !grepl("a", a), b == "second", c < 5)
prep2(df_in, exprs(!grepl("a", a), b == "second", c < 5))
prep2(df_in, c('!grepl("a", a)', 'b == "second"', 'c < 5'))