I have the following regular expressions:
var regEx = /^\W*(.*?)\W*$/;
var regEx2 = /^\W*(.*)\W*$/;
1. What does (.*?) actually mean? What's the difference between (.*?) and (.*)?
2. Why does regEx.exec("abc ") return ['abc ', 'abc'] in JavaScript?
3. Why does regEx2.exec("abc ") return ['abc ', 'abc '] in JavaScript?
- (.*?) - This is a group that matches any characters, and it's a non-greedy match, i.e. it does not try to match everything it can; it stops as soon as the rest of the pattern can match. – 18bytes Commented Jul 10, 2012 at 5:39
3 Answers
1. Adding ? after a quantifier (*, +, {n,m}, etc.) makes it reluctant/lazy, as opposed to the default greedy matching. The names are quite intuitive: greedy means it will try to match as many characters as possible; lazy means it will try to match as few as possible.
2. There is no non-word (\W) character at the start of the string, so the first \W* matches the empty string. Then (.*?) matches as few characters as possible, checking at each step whether the trailing \W*$ can match what is left. So (.*?) will match and capture "abc", and the second \W* (non-word) will match the space.
3. Almost the same as above, but (.*) will eat up as much as possible and will match and capture "abc ", and the second \W* will be left with the empty string, which it matches.
For 2 and 3, the 2nd element in the returned array is the text captured by the first capturing group in the regex. The 1st element in the array is the text that matches the entire regex.
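As a quick check of the explanation above, here is a minimal sketch that runs both patterns from the question on the same input. The variable names (lazy, greedy, lazyResult, greedyResult) are only illustrative and do not come from the question.

```javascript
// Same patterns as in the question; only the quantifier inside the group differs.
const lazy = /^\W*(.*?)\W*$/;   // lazy group: captures as little as possible
const greedy = /^\W*(.*)\W*$/;  // greedy group: captures as much as possible

const lazyResult = lazy.exec("abc ");
const greedyResult = greedy.exec("abc ");

// Element 0 is the whole match, element 1 is the first capturing group.
console.log(JSON.stringify(lazyResult[0]), JSON.stringify(lazyResult[1]));
// "abc " "abc"
console.log(JSON.stringify(greedyResult[0]), JSON.stringify(greedyResult[1]));
// "abc " "abc "
```

In both cases the whole match is the same; the only thing that changes is whether the trailing space ends up inside the group or is left for the final \W*.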
What does (.*?) actually mean?
Non-greedily match any character zero or more times, in a capturing group.
Why does regEx.exec("abc ") return ['abc ', 'abc'] in JavaScript?
The array has one element for the entire match plus one for each capturing group. The element at index 0 is the entire match, and the next element is from the first (and only) capturing group above.
Why does regEx2.exec("abc ") return ['abc ', 'abc '] in JavaScript?
For the same reason as above, except this time the greedy match will match the space at the end as well, so your first capture group is identical to the full match in this case.
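To make the layout of the result array concrete, the sketch below uses a hypothetical pattern with two capturing groups (pairPattern is not from the question). It is only meant to show that index 0 holds the whole match and that each group adds one more element.

```javascript
// Hypothetical two-group pattern, used only to illustrate the array layout.
const pairPattern = /^(\w+)\s+(\w+)$/;
const pairResult = pairPattern.exec("hello world");

console.log(pairResult.length); // 3: the whole match plus two groups
console.log(pairResult[0]);     // "hello world" (entire match)
console.log(pairResult[1]);     // "hello" (first group)
console.log(pairResult[2]);     // "world" (second group)
console.log(pairResult.index);  // 0: position where the match starts
```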
Okay, I find the easiest way to read a regular expression is to break it down and write out what each part is doing.
So, taking the first regular expression, /^\W*(.*?)\W*$/:
^ Start of search string
\W* Match a non-word character zero or more times
( Start of group
.*? Match any character (except a line terminator) zero or more times but as few as possible
) End of group
\W* Match a non-word character zero or more times
$ End of search string
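The same breakdown can be written as code by assembling the pattern from commented pieces. This is just a sketch: the names parts and assembled are made up for illustration, and the backslashes are doubled because the pieces are string literals rather than a regex literal.

```javascript
// Each piece of the pattern, annotated as in the breakdown above.
const parts = [
  "^",     // start of search string
  "\\W*",  // a non-word character, zero or more times
  "(",     // start of group
  ".*?",   // any character except a line terminator, as few as possible
  ")",     // end of group
  "\\W*",  // a non-word character, zero or more times
  "$",     // end of search string
];

const assembled = new RegExp(parts.join(""));

console.log(assembled.source);                                // ^\W*(.*?)\W*$
console.log(JSON.stringify(assembled.exec("abc ")[1]));       // "abc"
```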
The exec method searches the text and returns an array of strings (or null if it fails). The string at element 0 is the substring matched by the entire expression; the strings after it correspond to the individual capture groups.
So for your first example, the entire expression matches "abc " but the (.*?) group captures "abc", and so you get two items in your array.
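Because exec can return null, it is worth checking the result before indexing into it. Below is a minimal sketch; trimNonWord is a hypothetical helper name, not something from the question. The second input fails to match because . does not match a line terminator and the pattern has no m flag, so $ can only match at the very end of the string.

```javascript
const regEx = /^\W*(.*?)\W*$/;

// Hypothetical helper: returns the text captured by the group,
// or null when the pattern does not match at all.
function trimNonWord(text) {
  const match = regEx.exec(text);
  return match === null ? null : match[1];
}

console.log(trimNonWord("abc "));        // "abc"
console.log(trimNonWord("  --abc-- "));  // "abc"
console.log(trimNonWord("abc\ndef"));    // null
```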