I am trying to find a regular expression which will do the following (working in JavaScript). I want to take a string which contains some tokens like (token) inside parentheses. My aim is to capture the tokens (including the parentheses). I will assume that parentheses are not nested, and that every open parenthesis is eventually closed.
The regular expression I would use is
[[^\(\)]*|(\(.*?\))]*
Let me break it down:
[ # Either of two things:
[^\(\)]* # the first is a substring not containing parentheses
|
( # the second is to be captured...
\(.*?\) # and should contain anything in parentheses - lazy match
)
]* # Any number of these blocks can appear
Needless to say, this will not work (why would I be asking here otherwise?):
var a = /[[^\(\)]*|(\(.*?\))]*/;
a.exec('foo(bar)');
It fails both in Firefox and Node. My previous attempt was a slightly more complicated regex:
(?:[^\(\)]*(\(.*?\)))*[^\(\)]*
which can be described as follows
(?: # A non-capturing group...
[^\(\)]* # ...containing any number of non-parentheses chars
(\(.*?\)) # ...followed by a captured token inside parentheses.
)* # There can be any number of such groups
[^\(\)]* # Finally, any number of non-parentheses, as above
This will work on foo(bar), but will fail on foo(bar)(quux), capturing only quux.
How should I fix the above regex?
Asked May 19, 2011 at 16:11 by Andrea; edited May 19, 2011 at 20:31 by nohat.
- good thing you assumed non-nesting, otherwise it wouldn't be a regular problem – Adam Bergmark Commented May 19, 2011 at 16:18
4 Answers
You can't have an arbitrary number of capture groups in a regex. Use the /g flag to accomplish this instead: s.match(/\([^\)]+\)/g)
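To illustrate the point about capture groups, here is a minimal sketch. It assumes the sample input foo(bar)(quux) from the question; the variable names are made up for the example. A capture group inside a repeated group is overwritten on each repetition, which would explain why only the last token is reported, while a global match returns every token.

```javascript
// Sample input taken from the question; variable names are illustrative only.
const input = 'foo(bar)(quux)';

// The asker's second regex: the capture group sits inside a repeated group,
// so each repetition overwrites what the group captured before.
const repeated = /(?:[^\(\)]*(\(.*?\)))*[^\(\)]*/.exec(input);
console.log(repeated[1]); // "(quux)"

// With the g flag, String.prototype.match returns all matches as an array.
const tokens = input.match(/\([^\)]+\)/g);
console.log(tokens); // ["(bar)", "(quux)"]
```

Note that with the g flag, match returns whole matches only, so the pattern itself has to describe exactly the token you want back.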
This works fine (tested in Chrome):
<your string here>.match(/(\(.*?\))/g)
It returns an array of matches:
str = 'Content(cap)(cap2)(cap3)'
str.match(/(\(.*?\))/g)
-> ["(cap)", "(cap2)", "(cap3)"]
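If you also need to know where each token sits in the string, one possible approach is to call exec repeatedly on a regex with the g flag, since each call resumes from the previous match. This is only a sketch; the names tokenPattern and found are invented for the example.

```javascript
// Same sample string as above; tokenPattern and found are illustrative names.
const str = 'Content(cap)(cap2)(cap3)';
const tokenPattern = /\(.*?\)/g;

const found = [];
let m;
// With the g flag, exec advances lastIndex after each match
// and returns null once there are no more matches.
while ((m = tokenPattern.exec(str)) !== null) {
  found.push({ token: m[0], index: m.index });
}

console.log(found);
// [ { token: '(cap)', index: 7 },
//   { token: '(cap2)', index: 12 },
//   { token: '(cap3)', index: 18 } ]
```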
If your aim is to capture the tokens inside the parentheses (including the delimiters), then a simple regular expression like:
\([^)]*?\)
will work.
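One practical difference between the negated class [^)]* and the lazy .*? used in other answers: in JavaScript, the dot does not match line terminators unless the s flag is set, so the two patterns behave differently when a token spans several lines. A small sketch, with a made-up multi-line input:

```javascript
// Hypothetical input in which the first token contains a newline.
const multiline = 'a(one\ntwo)b(three)';

// "." stops at the newline, so the first token cannot be matched.
const lazyDot = multiline.match(/\(.*?\)/g);
console.log(lazyDot); // ["(three)"]

// A negated character class matches any character except ")", newlines included.
const negated = multiline.match(/\([^)]*\)/g);
console.log(negated); // ["(one\ntwo)", "(three)"]
```

Given the question's assumption that parentheses are never nested, both patterns agree on single-line input.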
var a = /\([^)]+\)/g;