
javascript - Complex regular expression - Stack Overflow

I am trying to find a regular expression which will do the following (working in JavaScript). I want to take a string which contains some tokens like (token) inside parentheses. My aim is to capture the tokens (including the parentheses). I will assume that parentheses are not nested, and that every open parenthesis is eventually closed.

The regular expression I would use is

[[^\(\)]*|(\(.*?\))]*

Let me break it down:

[            # Either of two things:
  [^\(\)]*   # the first is a substring not containing parentheses
|
  (          # the second is to be captured...
    \(.*?\)  # and should contain anything in parentheses - lazy match
  )
]*           # Any number of these blocks can appear

Needless to say, this will not work (why would I be asking here otherwise?):

var a = /[[^\(\)]*|(\(.*?\))]*/;
a.exec('foo(bar)');

It fails in both Firefox and Node. My previous attempt was a slightly more complicated regex:

(?:[^\(\)]*(\(.*?\)))*[^\(\)]*

which can be described as follows

(?:              # A non-capturing group...
  [^\(\)]*       # ...containing any number of non-parentheses chars
  (\(.*?\))      # ...followed by a captured token inside parentheses.
)*               # There can be any number of such groups
[^\(\)]*         # Finally, any number of non-parentheses, as above

This will work on foo(bar), but will fail on foo(bar)(quux), capturing only quux.

How should I fix the above regex?

Asked May 19, 2011 at 16:11 by Andrea; edited May 19, 2011 at 20:31 by nohat.
  • good thing you assumed non-nesting, otherwise it wouldn't be a regular problem – Adam Bergmark, commented May 19, 2011 at 16:18

4 Answers

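You can't have an arbitrary number of capture groups in a regex: a capture group inside a repeated group keeps only its last match, which is why foo(bar)(quux) yields only the last token. Use the /g flag to accomplish this instead: s.match(/\([^\)]+\)/g)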
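A minimal sketch of both behaviors described here: the repeated group from the question keeping only its last capture, and the /g flag collecting every token. The helper name findTokens and the second sample string are illustrative assumptions, not part of the original answer.

```javascript
// Hypothetical helper: collect every parenthesized token in a string.
function findTokens(s) {
  // With the g flag, match() returns all matches, or null when there are none.
  return s.match(/\([^\)]+\)/g) || [];
}

// The question's second regex: group 1 sits inside a repeated group,
// so after the match it holds only the value from the last iteration.
var repeated = /(?:[^\(\)]*(\(.*?\)))*[^\(\)]*/.exec('foo(bar)(quux)');
console.log(repeated[1]);                  // -> (quux)

console.log(findTokens('foo(bar)(quux)')); // -> ["(bar)", "(quux)"]
console.log(findTokens('no tokens here')); // -> []
```

Falling back to an empty array avoids a null check at every call site, since match() returns null rather than [] when nothing matches.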

This works fine (tested in Chrome):

<your string here>.match(/(\(.*?\))/g)

It returns an array of matches:

var str = 'Content(cap)(cap2)(cap3)';
str.match(/(\(.*?\))/g);
// -> ["(cap)", "(cap2)", "(cap3)"]
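If the position of each token is needed as well, the same global regex can be stepped with exec, which the question already calls. This is only a sketch under that assumption; tokenPositions is a hypothetical name.

```javascript
// Hypothetical helper: return each token together with its start index.
function tokenPositions(s) {
  var re = /\(.*?\)/g; // with the g flag, exec() resumes from re.lastIndex
  var found = [];
  var m;
  // exec() returns null once no further match exists, ending the loop.
  while ((m = re.exec(s)) !== null) {
    found.push({ token: m[0], index: m.index });
  }
  return found;
}

console.log(tokenPositions('Content(cap)(cap2)(cap3)'));
// tokens "(cap)", "(cap2)", "(cap3)" at indices 7, 12, 18
```

The regex always consumes at least the two parentheses, so the loop cannot stall on a zero-length match.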

If your aim is to capture the tokens inside the parentheses (including the delimiters), then a simple regular expression like:

\([^)]*?\)

will work.

var a = /\([^)]+\)/g;
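The answers above spell the token body in two ways, .*? and [^)]+ (or [^)]*?). They agree on simple single-line input but can differ at the edges. The sample strings below are assumptions chosen to show the difference, not cases from the question.

```javascript
var lazyDot = /\(.*?\)/g;   // "." does not match line terminators by default
var negated = /\([^)]*\)/g; // "[^)]" matches any character except ")"

var multiline = 'a(one\ntwo)b';
console.log(multiline.match(lazyDot)); // -> null
console.log(multiline.match(negated)); // -> ["(one\ntwo)"]

// "+" requires at least one character between the parentheses; "*" does not.
console.log('x()y'.match(/\([^)]+\)/g)); // -> null
console.log('x()y'.match(/\([^)]*\)/g)); // -> ["()"]
```

Which variant fits depends on whether tokens may span lines or be empty, which the question does not specify.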
