I have a data set of behavioral data. I want to label each behavior as "aggressive", "submissive", or "affiliative", or leave it blank, in a column of the data frame.
There are multiple types of each of these behaviors. For example, "fin raise" and "fast approach" are both aggressive behaviors.
I tried this:
if (G14$Behavior == "slow approach" | "fin raise" | "fast approach" | "tail beat" | "ram" | "bite") {
G14$`Behavioral category` <- "aggressive"
} else if (G14$Behavior == "flee" | "avoid" | "tail quiver") {
G14$`Behavioral category` <- "submissive"
} else if (G14$Behavior == "bump" | "join") {
G14$`Behavioral category` <- "affiliative"
} else {
G14$`Behavioral category` <- ""
}
But got this error:
operations are possible only for numeric, logical or complex types
Is there any way to do this with character strings?
Asked 2 days ago by Kitt; edited 2 days ago by Ben Bolker.
4 Answers
The answer you provided works, but this would work slightly better:
case_when(Behavior %in% c("slow approach", "fin raise", "fast approach",
"tail beat", "ram", "bite") ~ "aggressive",
Behavior %in% c("flee", "avoid", "tail quiver") ~ "submissive",
...)
(%in% is base R, so it will work for people who don't want to use the tidyverse; matching against exact strings is more precise and faster than matching against regular expressions.)
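As a minimal base-R sketch of the same %in% idea (not the answerer's exact code, whose remaining branches are elided above), the categories can be filled in by logical indexing. The toy data frame and the helper vectors aggressive, submissive, and affiliative are assumptions made for illustration; the behavior names come from the question. The sketch also captures the error raised when | is applied to character strings, which is what the original attempt runs into.

```r
# Toy data (an assumption for illustration, not the asker's real G14)
G14 <- data.frame(Behavior = c("fin raise", "flee", "bump", "random behavior"))

# Lookup vectors built from the behaviors listed in the question
aggressive  <- c("slow approach", "fin raise", "fast approach",
                 "tail beat", "ram", "bite")
submissive  <- c("flee", "avoid", "tail quiver")
affiliative <- c("bump", "join")

# "a" | "b" fails: | only accepts logical, numeric or complex operands.
# tryCatch() stores the error message instead of stopping the script.
or_error <- tryCatch("slow approach" | "fin raise",
                     error = function(e) conditionMessage(e))

# Start with blanks, then overwrite rows whose Behavior is in each set
G14$Behavioral.category <- ""
G14$Behavioral.category[G14$Behavior %in% aggressive]  <- "aggressive"
G14$Behavioral.category[G14$Behavior %in% submissive]  <- "submissive"
G14$Behavioral.category[G14$Behavior %in% affiliative] <- "affiliative"

print(G14)
```

Because %in% compares whole strings, a value such as "random behavior" stays blank rather than being matched on a fragment.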
I was able to figure it out! For those who run into the same problem, the dplyr and stringr packages provide the functions case_when and str_detect. It would look something like this:
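library(dplyr)
library(stringr)

# str_detect() treats each pattern as a regular expression, so "|" means "or".
# Rows that match none of the patterns get NA.
G14 <- G14 %>% mutate(Behavioral.category = case_when(
  str_detect(Behavior, "slow approach|fin raise|fast approach|bite") ~ "aggressive",
  str_detect(Behavior, "flee|avoid|tail quiver") ~ "submissive",
  str_detect(Behavior, "bump|join") ~ "affiliative"
))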
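The question asked for unmatched behaviors to be left blank, whereas case_when returns NA when no condition matches. One possible way to get blanks, sketched here on a toy data frame (an assumption, not the real G14), is to add a catch-all TRUE condition as the last branch.

```r
library(dplyr)
library(stringr)

# Toy data (an assumption for illustration)
G14 <- data.frame(Behavior = c("fin raise", "flee", "bump", "random behavior"))

G14 <- G14 %>%
  mutate(Behavioral.category = case_when(
    str_detect(Behavior, "slow approach|fin raise|fast approach|bite") ~ "aggressive",
    str_detect(Behavior, "flee|avoid|tail quiver") ~ "submissive",
    str_detect(Behavior, "bump|join") ~ "affiliative",
    TRUE ~ ""  # catch-all: anything not matched above becomes blank
  ))

print(G14)
```

Conditions are evaluated in order, so the catch-all has to come last.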
While using %in% is perhaps the appropriate solution here, you may have been looking for grepl, which accepts patterns that include the '|' operator. I'd prefer using NA for non-matches, but it's up to you to encode the remaining categories differently.
> within(G14, {
+ Behavior_cat <- NA
+ Behavior_cat[
+ grepl("slow approach|fin raise|fast approach|tail beat|ram|bite", Behavior)
+ ] <- "aggressive"
+ Behavior_cat[
+ grepl("flee|avoid|tail quiver", Behavior)
+ ] <- "submissive"
+ Behavior_cat[
+ grepl("bump|join", Behavior)
+ ] <- 'affiliative'
+ })
Behavior Behavior_cat
1 slow approach aggressive
2 fin raise aggressive
3 fast approach aggressive
4 tail beat aggressive
5 ram aggressive
6 bite aggressive
7 flee submissive
8 avoid submissive
9 tail quiver submissive
10 bump affiliative
11 join affiliative
12 random behavior <NA>
Here's an alternative solution using stringi::stri_replace_all_regex:
> G14 |>
+ transform(
+ Behavior_cat=stringi::stri_replace_all_regex(
+ Behavior,
+ list(c('slow approach|fin raise|fast approach|tail beat|ram|bite'),
+ c('flee|avoid|tail quiver'),
+ c('bump|join'), c('random behavior')),
+ list('aggressive', 'submissive', 'affiliative', NA_character_),
+ vectorize_all=FALSE)
+ )
Behavior Behavior_cat
1 slow approach aggressive
2 fin raise aggressive
3 fast approach aggressive
4 tail beat aggressive
5 ram aggressive
6 bite aggressive
7 flee submissive
8 avoid submissive
9 tail quiver submissive
10 bump affiliative
11 join affiliative
12 random behavior <NA>
Note that, as written, these patterns also match parts of words. To match whole words only, include word-boundary metacharacters, or use ^ and $ to denote the start and end of the pattern, as shown e.g. in this answer.
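The difference between loose, anchored, and word-bounded patterns can be sketched with grepl on a small vector. The values "frame shift" and "ram twice" are made up for illustration (they are not behaviors from the question); they were chosen because "frame" contains the letters "ram".

```r
# Hypothetical values chosen to expose partial matches on "ram"
behavior <- c("ram", "frame shift", "ram twice", "flee")

loose    <- "ram|bite"           # matches anywhere inside the string
anchored <- "^(ram|bite)$"       # the whole string must be "ram" or "bite"
bounded  <- "\\b(ram|bite)\\b"   # whole words, possibly inside a longer string

loose_hits    <- grepl(loose, behavior)
anchored_hits <- grepl(anchored, behavior)
bounded_hits  <- grepl(bounded, behavior)

print(data.frame(behavior, loose_hits, anchored_hits, bounded_hits))
```

The loose pattern flags "frame shift" as well, the anchored pattern accepts only the exact string "ram", and the word-bounded pattern accepts "ram" and "ram twice" but not "frame shift".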
Data:
> dput(G14)
structure(list(Behavior = c("slow approach", "fin raise", "fast approach",
"tail beat", "ram", "bite", "flee", "avoid", "tail quiver", "bump",
"join", "random behavior"), Behavior_cat = c("aggressive", "aggressive",
"aggressive", "aggressive", "aggressive", "aggressive", "aggressive",
"aggressive", "aggressive", "aggressive", "aggressive", "aggressive"
)), row.names = c(NA, -12L), class = "data.frame")
1) case_match We can use case_match from dplyr. Its first argument is the vector of codes; the remaining arguments are formulas with the possible codes on the left-hand side and the replacement on the right.
library(dplyr)
G14 %>%
mutate(Behavioral.category = case_match(Behavior,
c("slow approach", "fin raise", "fast approach", "bite") ~ "aggressive",
c("flee", "avoid", "tail quiver") ~ "submissive",
c("bump", "join") ~ "affiliative")
)
giving the following, using the input in the Note at the end:
Behavior Behavioral.category
1 slow approach aggressive
2 fin raise aggressive
3 fast approach aggressive
4 bite aggressive
5 flee submissive
6 avoid submissive
7 tail quiver submissive
8 bump affiliative
9 join affiliative
2) fct_collapse First create a list L whose names are the replacement codes and whose values are vectors of existing codes, and then use that with fct_collapse.
library(dplyr)
library(forcats)
L <- list(
aggressive = c("slow approach", "fin raise", "fast approach", "bite"),
submissive = c("flee", "avoid", "tail quiver"),
affiliative = c("bump", "join")
)
G14 %>% mutate(Behavior.category = fct_collapse(Behavior, !!!L))
3) left_join We can also use left_join with L defined above.
library(dplyr)
G14 %>%
left_join(stack(L), join_by(Behavior == values)) %>%
rename(Behavior.Category = ind)
4) Base R Using match with L from above, we can obtain a Base R approach.
stk <- stack(L)
G14 |> transform(Behavior.category = stk$ind[match(Behavior, stk$values)])
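To see what stack and match are doing in the Base R approach, here is a small sketch on a trimmed-down list and a toy vector (both are assumptions for illustration). stack turns the named list into a two-column data frame (values, ind), and match returns the row position of each behavior in values, or NA when there is none.

```r
# Trimmed-down lookup list (an assumption; the full L is defined above)
L <- list(
  aggressive = c("slow approach", "fin raise"),
  submissive = c("flee")
)

stk <- stack(L)  # data frame: values = behavior, ind = category (a factor)
print(stk)

# Toy input, including one behavior that is not in the lookup
behavior <- c("flee", "fin raise", "unknown")

pos <- match(behavior, stk$values)    # row positions in stk, NA if absent
category <- as.character(stk$ind[pos])

# Optional: use blanks instead of NA, as the question asked
category_blank <- ifelse(is.na(category), "", category)

print(data.frame(behavior, category, category_blank))
```

Converting ind with as.character is optional; without it the new column stays a factor.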
Note
Input data used
G14 <- data.frame(Behavior = c("slow approach", "fin raise", "fast approach",
"bite", "flee", "avoid", "tail quiver", "bump", "join"))
%in% and ==? to better understand why %in% is optimal here over ==. Good luck and happy coding! – jpsmith, commented 2 days ago