I have a RegExp like the following simplified example:
var exp = /he|hell/;
When I run it on a string, it gives me the first match, e.g.:
var str = "hello world";
var match = exp.exec(str);
// match contains ["he"];
I want the first and longest possible match, and by that I mean sorted by index, then by length.
Since the expression is combined from an array of RegExps, I am looking for a way to find the longest match without having to rewrite the regular expression.
Is that even possible?
If it isn't, I am looking for a way to easily analyze the expression and arrange it in the proper order. But I can't figure out how, since the expressions could be a lot more complex, e.g.:
var exp = /h..|hel*/
asked Jan 21, 2010 at 13:56 by Michael Andersen, edited Jan 22, 2010 at 7:44
5 Answers
How about /hell|he/?
All regex implementations I know of will (try to) match characters/patterns from left to right and stop as soon as they find an overall match.
In other words: if you want to make sure you get the longest possible match, you'll need to try all your patterns separately, store all the matches, and then pick the longest one.
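A minimal sketch of that approach, assuming the sub-patterns are still available as an array before they are joined with `|`. The helper name `firstLongestMatch` is hypothetical, and the ordering (lowest index first, then greatest length) follows the question's wording:

```javascript
// Hypothetical helper: try every pattern on its own and keep the match
// that starts earliest; among matches starting at the same index, keep
// the longest one.
function firstLongestMatch(patterns, str) {
  let best = null;
  for (const pattern of patterns) {
    // Rebuild the pattern without the g/y flags so exec() always
    // searches from the start of the string.
    const flags = pattern.flags.replace(/[gy]/g, "");
    const match = new RegExp(pattern.source, flags).exec(str);
    if (match === null) continue;
    if (
      best === null ||
      match.index < best.index ||
      (match.index === best.index && match[0].length > best[0].length)
    ) {
      best = match;
    }
  }
  return best; // null when no pattern matches
}

const result = firstLongestMatch([/he/, /hell/], "hello world");
console.log(result[0], result.index); // hell 0
```

This only compares the match each pattern produces by itself. If a single pattern contains its own alternation or lazy quantifiers, its result can still be shorter than the longest string that pattern could match.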
You can do it. It's explained here: http://www.regular-expressions.info/alternation.html
(In summary: either change the order of the alternatives so the longer one comes first, or make the trailing part of the pattern an optional group with a question mark.)
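The two rewrites can be compared on the question's example. This is only an illustration: the variable names are made up, and `/he(?:ll)?/` is one possible reading of "group with question mark", not necessarily the exact form the linked page uses:

```javascript
const text = "hello world";

// Original order: "he" wins because it is the first alternative that succeeds.
const shortFirst = /he|hell/.exec(text)[0];

// Option 1: list the longer alternative first.
const longFirst = /hell|he/.exec(text)[0];

// Option 2: make the extra characters an optional, greedy group.
const optionalTail = /he(?:ll)?/.exec(text)[0];

console.log(shortFirst, longFirst, optionalTail); // he hell hell
```

Both rewrites require editing the pattern by hand, which is what the question hoped to avoid for more complex alternatives such as /h..|hel*/.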
You cannot do "longest match" (or anything involving counting, minus look-aheads) with regular expressions.
Your best bet is to find all matches, and simply compare the lengths in the program.
I don't know if this is what you're looking for (Considering this question is almost 8 years old...), but here's my grain of salt:
(Putting hell before he makes the engine try the longer alternative first.)
var exp = /hell|he/;
var str = "hello world";
var match = exp.exec(str);
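if (match) {
    // exec() returns [fullMatch, ...captureGroups]. This pattern has no capture
    // groups, so the array has a single element and the sort changes nothing.
    match.sort(function (a, b) { return b.length - a.length; });
    console.log(match[0]); // "hell"
}
Here match[0] is "hell", the longest alternative that matches at that position. That comes from listing hell before he in the pattern, not from the sort.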
/h....|hel*/ – Mark Byers Commented Jan 21, 2010 at 14:00
/h.*?|hello/. But I guess the users of this site know what I mean anyway. At least you did :-) – Michael Andersen Commented Jan 21, 2010 at 14:06
exp = /.*(?<=h..|hel*)/. But so far this feature is not expected in JS. – Antony Hatchkins Commented Jan 21, 2010 at 15:43