javascript - Regular Expression "AND" - Stack Overflow

I'm doing some basic text matching from an input. I need the ability to perform a basic "AND". For "ANY" I split the input by spaces and join each word by the pipe ("|") character but I haven't found a way to tell the regular expression to match any of the words.

switch (searchOption) {
  case "any":
    inputArray = input.split(" ");
    if (inputArray.length > 1) { input = inputArray.join("|"); }
    text = input;
    break;
  case "all":
    inputArray = input.split(" ");
    ***[WHAT TO DO HERE?]***
    text = input;
    break;
  case "exact":
    inputArray = new Array(input);
    text = input;
    break;
}

Seems like it should be easy.

asked Aug 21, 2009 at 14:56 by webwires; edited Aug 21, 2009 at 14:58 by Gavin Miller
  • You mean 'to match all of the words'? – wds Commented Aug 21, 2009 at 15:02

4 Answers

Use lookahead. Try this:

if( inputArray.length>1 ) rgx = "(?=.*" + inputArray.join( ")(?=.*" ) + ").*";

You'll end up with something like

(?=.*dog)(?=.*cat)(?=.*mouse).*

Which should only match if all the words appear, but they can be in any order.

  • The dog ate the cat who ate the mouse.
  • The mouse was eaten by the dog and the cat.
  • Most cats love mouses and dogs.

But not

  • The dog ate the mouse.
  • Cats and dogs like mice.

The way it works is that the regex engine scans from the current match point (0) looking for .*dog, the first sub-regex (any number of any character, followed by dog). When it determines true-ness of that regex, it resets the match point (back to 0) and continues with the next sub-regex. So, the net is that it doesn't matter where each word is; only that every word is found.

EDIT: @Justin pointed out that I should have a trailing .*, which I've added above. Without it, text.match(regex) works, but regex.exec(text) returns an empty match string. With the trailing .*, you get the matching string.
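
For illustration, here is a minimal sketch of building that lookahead pattern from user input. The helper names escapeRegExp and buildAllWordsRegex are made up for this example and are not part of the answer above. It also makes two assumptions the answer does not: each word is escaped so that characters like "." or "+" are matched literally, and a leading ^ is added so the lookaheads are only evaluated from the start of the string.

```javascript
// Hypothetical helper: escape regex metacharacters so each word is matched literally.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build one lookahead per word, then a trailing .* to
// consume the line, as in the answer above.
function buildAllWordsRegex(input) {
  var words = input.split(/\s+/).filter(Boolean).map(escapeRegExp);
  // The leading ^ is an assumption added here, not part of the original pattern.
  return new RegExp("^(?=.*" + words.join(")(?=.*") + ").*");
}

var rgx = buildAllWordsRegex("dog cat mouse");
console.log(rgx.source);
console.log(rgx.test("The mouse was eaten by the dog and the cat."));
console.log(rgx.test("Cats and dogs like mice."));
```

Note that . does not match line breaks by default, so this sketch only works as intended on single-line text.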

The problem with "and" is: in what combination do you want the words? Can they appear in any order, or must they be in the order given? Can they appear consecutively or can there be other words in between?

These decisions impact heavily what search (or searches) you do.

If you're looking for "A B C" (in order, consecutively), the expression is simply /A B C/. Done!

If you're looking for "A foo B bar C" it might be /A.*?B.*?C/

If you're looking for "B foo A foo C" you'd be better off doing three separate tests for /A/, /B/, and /C/
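
As a rough sketch of the first two cases, the patterns can be built from a word list. The function names below are hypothetical, and escaping the words is an assumption added for safety with arbitrary input.

```javascript
// Hypothetical helper: escape regex metacharacters so each word is matched literally.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Words in order and consecutive, separated by single spaces: /A B C/
function buildPhraseRegex(words) {
  return new RegExp(words.map(escapeRegExp).join(" "));
}

// Words in order, with anything (or nothing) in between: /A.*?B.*?C/
function buildInOrderRegex(words) {
  return new RegExp(words.map(escapeRegExp).join(".*?"));
}

var words = ["A", "B", "C"];
console.log(buildPhraseRegex(words).test("A B C"));
console.log(buildInOrderRegex(words).test("A foo B bar C"));
console.log(buildInOrderRegex(words).test("B foo A foo C"));
```

Neither pattern handles the any-order case, which is why separate tests per word are the simpler choice there.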

Do a simple for loop and search for every term, something like this:

var n = inputArray.length;
if (n) {
    for (var i=0; i<n; i++) {
        if (text.indexOf(inputArray[i]) === -1) { // inputArray[i] not in text
            break;
        }
    }
    if (i != n) {
        // not all terms were found
    }
}
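
The same loop can be written more compactly with Array.prototype.every. This is only a sketch: containsAllTerms is a hypothetical name, and it does plain case-sensitive substring checks rather than regex matching.

```javascript
// Hypothetical helper: true only if every term occurs somewhere in text.
// every() stops at the first term that is missing, like the break above.
function containsAllTerms(text, terms) {
  return terms.every(function (term) {
    return text.indexOf(term) !== -1;
  });
}

var terms = "dog cat mouse".split(" ");
console.log(containsAllTerms("The dog ate the cat who ate the mouse.", terms));
console.log(containsAllTerms("The dog ate the mouse.", terms));
```

With an empty list of terms, every() returns true, so you may want to handle empty input separately, as the if (n) check above does.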

My regular expressions cookbook does feature a regular expression that can possibly do this using conditionals. However, it's quite complicated, so I'd go for the currently top rated answer which is iterating over the options. Anyway, trying to adapt their example I think it would be something like:

\b(?:(?:(word1)|(word2))(\b.*?)){2,}(?(1)|(?!))(?(2)|(?!))

No guarantees that this'll work as is, but it's the basic idea I think. See what I mean about complicated?
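
One caveat for this question specifically: the (?(1)...) conditional syntax comes from other regex flavors, and JavaScript's RegExp does not support it, so the pattern above is rejected when compiled in JavaScript. A small sketch to check this (compiles is a hypothetical helper name):

```javascript
// Hypothetical helper: report whether a pattern string compiles in this engine.
function compiles(pattern) {
  try {
    new RegExp(pattern);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e;
  }
}

// The conditional pattern from above, written as a string literal.
var conditionalPattern =
  "\\b(?:(?:(word1)|(word2))(\\b.*?)){2,}(?(1)|(?!))(?(2)|(?!))";
// A lookahead-based pattern for the same two words, which JavaScript accepts.
var lookaheadPattern = "(?=.*word1)(?=.*word2).*";

console.log(compiles(conditionalPattern));
console.log(compiles(lookaheadPattern));
```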
