Replacing some commas with semicolons - R - Stack Overflow

I am trying to write a function that takes a column from a data frame and replaces some of its commas with semicolons. Take the example value below (i.e., one cell from a data frame's column):

Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)

Turned into

Orange (carrot, orange); Blue (sky, ball); Yellow (lemon, boots)

I've created the function below, but 1) it does not like the current regex (I keep getting errors) and 2) it replaces all values with NA. Some cells are already NA, and I'd like to keep those as they are, replacing commas only in cells that contain data.

semicolon <- function(x) {
  as.double(gsub(",(?=[A-Z])",";", x))
}

asked Mar 27 at 18:25 by Kayla Schou
  • 1 Welcome to SO! Please try to include errors and warning messages you encounter. By "some commas" you mean those that follow closing parens? – margusl Commented Mar 27 at 18:46
  • 2 May I ask, what you want to achieve by transforming a string to a double with as.double()? Also in this case, no negative lookaheads are needed, you can also use gsub(")\\, ", ")\\; " , "Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)") – Tim G Commented Mar 27 at 18:59
  • Hi @KaylaSchou, if the answer has solved your question, please consider accepting it by clicking the check-mark. This indicates to the wider community that you've found a solution and gives some reputation to both the answerer and yourself. (There is no obligation to do this.) – r2evans Commented Apr 1 at 0:56

1 Answer

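You need to use a lookbehind to find commas that come right after a close paren ("(?<=\\))"). Your current approach ("(?=[A-Z])") uses a lookahead to find commas immediately before a capital letter, which never matches here because each comma is followed by a space. You also need to set perl = TRUE to use lookarounds; without it, gsub() rejects the pattern as an invalid regular expression. And it's not clear why you're including as.double(), which tries to coerce the character string to a number and returns NA for any text that isn't numeric. So: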

semicolon <- function(x) {
  gsub("(?<=\\)),", ";", x, perl = TRUE)
}

x <- "Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)"

semicolon(x)
# "Orange (carrot, orange); Blue (sky, ball); Yellow (lemon, boots)"
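
To see where the original attempt goes wrong, here is a minimal sketch that runs the question's lookahead with perl = TRUE and then passes the result through as.double(). The variable names `unchanged` and `with_space` are only illustrative, and the adjusted pattern assumes every separating comma is followed by a single space and a capital letter, as in the sample value.

```r
x <- "Orange (carrot, orange), Blue (sky, ball), Yellow (lemon, boots)"

# With perl = TRUE the original pattern is accepted, but it matches nothing:
# a space sits between each comma and the capital letter that follows it.
unchanged <- gsub(",(?=[A-Z])", ";", x, perl = TRUE)
identical(unchanged, x)

# Including the space inside the lookahead matches only the separating commas.
with_space <- gsub(",(?= [A-Z])", ";", x, perl = TRUE)
with_space

# as.double() cannot parse this text as a number, so the result is NA
# (R also raises a coercion warning, suppressed here).
suppressWarnings(as.double(with_space))
```

This lookahead variant depends on the capitalisation of the items, whereas the lookbehind above depends only on the close paren, which is probably the more robust anchor for data like this.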

Edit: As @TimG pointed out in a comment, you don't need the lookbehind if you match the close paren as part of the pattern and put it back in the replacement:

semicolon <- function(x) gsub("\\),", ");", x)

semicolon(x)
# "Orange (carrot, orange); Blue (sky, ball); Yellow (lemon, boots)"
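
Since the question is about a whole data frame column that contains some NA cells, a short sketch of that case may help. The data frame `df` and its column name `colours` are hypothetical stand-ins for the real data. gsub() is vectorised and leaves NA elements as NA, so no special handling should be needed once as.double() is dropped.

```r
semicolon <- function(x) gsub("\\),", ");", x)

# Hypothetical data: one column with two filled cells and one missing cell.
df <- data.frame(
  colours = c("Orange (carrot, orange), Blue (sky, ball)",
              NA,
              "Yellow (lemon, boots)"),
  stringsAsFactors = FALSE
)

# Apply the function to the whole column at once; the NA cell stays NA.
df$colours <- semicolon(df$colours)
df$colours
```

The second cell remains NA, the first has its separating comma replaced, and the third is unchanged because it contains no comma after a close paren.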