regex - Using javascript regexp to find the first AND longest match - Stack Overflow

I have a RegExp like the following simplified example:

var exp = /he|hell/;

When I run it on a string, it gives me the first match, e.g.:

var str = "hello world";
var match = exp.exec(str);
// match contains ["he"];

I want the first and longest possible match; by that I mean matches sorted by index first, then by length.

Since the expression is combined from an array of RegExps, I am looking for a way to find the longest match without having to rewrite the regular expression.

Is that even possible?

If it isn't, I am looking for a way to easily analyze the expression and arrange its alternatives in the proper order. But I can't figure out how, since the expressions could be a lot more complex, e.g.:

var exp = /h..|hel*/

Asked Jan 21, 2010 at 13:56 by Michael Andersen; edited Jan 22, 2010 at 7:44.

3 Comments
  • Your second example would be a lot more interesting if it were for example: /h....|hel*/ – Mark Byers Commented Jan 21, 2010 at 14:00
  • It looks the same to me. I actually wanted to illustrate that the longest regexp is not necessarily the longest match. My simple expression should have been something like /h.*?|hello/. But I guess the users of this site know what I mean anyway. At least you did :-) – Michael Andersen Commented Jan 21, 2010 at 14:06
  • If variable-width lookbehind assertions were possible in javascript (as they are for example in .NET and JGsoft regex flavours) you could achieve it this way: exp = /.*(?<=h..|hel*)/ . But so far this feature is not expected in JS. – Antony Hatchkins Commented Jan 21, 2010 at 15:43

5 Answers

How about /hell|he/ ?

All regex implementations I know of will (try to) match characters/patterns from left to right and terminate whenever they find an over-all match.

In other words: if you want to make sure you get the longest possible match, you'll need to try all your patterns (separately), store all matches and then get the longest match from all possible matches.
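A minimal sketch of that approach, assuming the alternatives are still available as an array of separate RegExp objects (as the question describes). The helper name firstLongestMatch is hypothetical, not a built-in, and the code relies on the RegExp flags property from ES2015:

```javascript
// Hypothetical helper: run each pattern separately and keep the match that
// starts earliest; among matches starting at the same index, keep the longest.
function firstLongestMatch(patterns, str) {
  var best = null;
  for (var i = 0; i < patterns.length; i++) {
    // Rebuild without the g/y flags so exec() always searches from index 0.
    var flags = patterns[i].flags.replace(/[gy]/g, "");
    var m = new RegExp(patterns[i].source, flags).exec(str);
    if (!m) continue;
    if (best === null ||
        m.index < best.index ||
        (m.index === best.index && m[0].length > best[0].length)) {
      best = m;
    }
  }
  return best; // null when no pattern matches
}

var best = firstLongestMatch([/he/, /hell/], "hello world");
console.log(best && best[0]); // "hell"
```

Each pattern still contributes only the match the engine itself prefers, so a lazy pattern such as /h.*?/ would report its shortest match rather than its longest.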

You can do it. It's explained here: http://www.regular-expressions.info/alternation.html

(In summary: either reorder the alternatives so that the longer one comes first, or make the trailing part of the pattern optional with a question mark, e.g. /he(ll)?/.)
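Both rewrites can be checked against the question's example. This is only a sketch for the simple /he|hell/ case, and it assumes you are free to edit the pattern by hand, which the question would prefer to avoid:

```javascript
var str = "hello world";

// Original order: the first alternative that succeeds wins.
var original = /he|hell/.exec(str)[0];

// Longer alternative listed first.
var reordered = /hell|he/.exec(str)[0];

// Common prefix followed by an optional, greedy group.
var optional = /he(?:ll)?/.exec(str)[0];

console.log(original, reordered, optional); // he hell hell
```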

You cannot do "longest match" (or anything involving counting, minus look-aheads) with regular expressions.

Your best bet is to find all matches, and simply compare the lengths in the program.

I don't know if this is what you're looking for (Considering this question is almost 8 years old...), but here's my grain of salt:

(Putting hell before he makes the engine try the longer alternative first.)

var exp = /hell|he/;
var str = "hello world";
var match = exp.exec(str);

if (match)
{
  // exec() returns a single match (plus any capture groups), so there is
  // nothing to sort here; the order of the alternatives does the work.
  console.log(match[0]); // "hell"
}

Here match[0] is "hell", because the longer alternative is listed first.
