Regex to match all words except those in parentheses - javascript - Stack Overflow

I'm using the following regex to match all words:

mystr.replace(/([^\W_]+[^\s-]*) */g, function (match, p1, index, title) {...});

Note that words can contain special characters like German Umlauts. How can I match all words excluding those inside parentheses?

If I have the following string:

here wäre c'è (don't match this one) match this

I would like to get the following output:

here
wäre
c'è
match
this

The trailing spaces don't really matter. Is there an easy way to achieve this with a regex in JavaScript?

EDIT: I cannot simply remove the text in parentheses, because the final string "mystr" must still contain it; the string operations should only be performed on the words that match. The final string contained in "mystr" could look like this:

Here Wäre C'è (don't match this one) Match This

asked Oct 15, 2012 at 11:36 by thomasf, edited Oct 15, 2012 at 16:06
  • I don't think that is possible with a single regex; you'll probably need to cut out the parentheses and their content first. – Sergey Rybalkin Commented Oct 15, 2012 at 11:44
  • Do you need to account for nested (like this (or even this)) parentheses? If so, you will have to impose an upper bound on the nesting or go to a non-RE-based solution. – Vatine Commented Oct 15, 2012 at 16:21
  • No need to account for nested parentheses. There can be several parentheses, but they will not be nested, e.g. "(like this) and like (this)" – thomasf Commented Oct 16, 2012 at 8:08
  • I accept Fabrizio's answer as it was correct before making my question more specific. To solve my problem I will search for the opening and closing parens inside the callback function. That's not as nice as a regex but it works well. – thomasf Commented Oct 16, 2012 at 12:38
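
The callback approach thomasf describes in the last comment might look roughly like the sketch below. It is an assumption about what was meant, not code from the thread: the function name capitalizeOutsideParens is hypothetical, the capitalization step is inferred from the example output in the question, and the paren check relies on the stated guarantee that parentheses are never nested.

```javascript
// Sketch only: capitalizeOutsideParens is a hypothetical helper name.
// It reuses the regex from the question and decides inside the callback
// whether the current word sits between an unclosed "(" and its ")".
function capitalizeOutsideParens(mystr) {
    return mystr.replace(/([^\W_]+[^\s-]*) */g, function (match, p1, index, title) {
        // Look only at the text before this match.
        var before = title.slice(0, index);
        var lastOpen = before.lastIndexOf('(');
        var lastClose = before.lastIndexOf(')');

        // A "(" that comes after the last ")" is still open, so this
        // word is inside parentheses: return it unchanged.
        if (lastOpen > lastClose) {
            return match;
        }

        // Assumed string operation: capitalize the first letter.
        return match.charAt(0).toUpperCase() + match.slice(1);
    });
}

console.log(capitalizeOutsideParens("here wäre c'è (don't match this one) match this"));
// prints: Here Wäre C'è (don't match this one) Match This
```

Because the check is a plain lastIndexOf comparison, it only holds for non-nested parentheses, which matches the constraint given in the comments.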

2 Answers

Try this:

var str = "here wäre c'è (don't match this one) match this";

str.replace(/\([^\)]*\)/g, '')  // remove text inside parens (& parens)
   .match(/(\S+)/g);            // match remaining text

// ["here", "wäre", "c'è", "match", "this"]

Thomas, resurrecting this question because it had a simple solution that wasn't mentioned and that doesn't require replacing then matching (one step instead of two steps). (Found your question while doing some research for a general question about how to exclude patterns in regex.)

Here's our simple regex (see it at work on regex101, looking at the Group captures in the bottom right panel):

\(.*?\)|([^\W_]+[^\s-]*)

The left side of the alternation matches complete (parenthesized phrases). We will ignore these matches. The right side matches and captures words to Group 1, and we know they are the right words because they were not matched by the expression on the left.

This program shows how to use the regex (see the matches in the online demo):

<script>
var subject = 'here wäre c\'è (don\'t match this one) match this';
var regex = /\(.*?\)|([^\W_]+[^\s-]*)/g;
var group1Caps = [];
var match = regex.exec(subject);

// put Group 1 captures in an array
while (match != null) {
    if( match[1] != null ) group1Caps.push(match[1]);
    match = regex.exec(subject);
}

// print each captured word on its own line
document.write("<br>*** Matches ***<br>");
for (var i = 0; i < group1Caps.length; i++) {
    document.write(group1Caps[i], "<br>");
}

</script>
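
The same alternation can also be passed straight to replace, which covers the EDIT in the question (the parenthesized text has to stay in the final string). The following is a minimal sketch rather than part of the original answer: the function name capitalizeWordsOutsideParens is hypothetical, and capitalizing the first letter is assumed from the example output in the question.

```javascript
// Sketch only: capitalizeWordsOutsideParens is a hypothetical helper name.
function capitalizeWordsOutsideParens(text) {
    // Left branch consumes a whole (parenthesized phrase) without capturing;
    // right branch captures a word outside parentheses in Group 1.
    var regex = /\(.*?\)|([^\W_]+[^\s-]*)/g;

    return text.replace(regex, function (match, word) {
        // When the left branch matched, Group 1 is undefined:
        // put the parenthesized text back untouched.
        if (word === undefined) {
            return match;
        }
        // Assumed string operation: capitalize the first letter.
        return word.charAt(0).toUpperCase() + word.slice(1);
    });
}

console.log(capitalizeWordsOutsideParens("here wäre c'è (don't match this one) match this"));
// prints: Here Wäre C'è (don't match this one) Match This
```

Returning the full match for the ignored branch is what keeps the parenthesized text in place; whitespace between words is never matched by either branch, so it is preserved as well.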

Reference

How to match (or replace) a pattern except in situations s1, s2, s3...
