I thought I understood how regex operators work, but now I'm really confused. In a simplified example, I have two strings:
mail.wow.no-1.
mail.ololo.wow.
I want to match the first one, NOT the second. So I wrote a regex (simplified version) like this:
^mail\.(.*)(?!\.wow\.)$
But when I run the JS method test on both of these examples, it simply returns true (in Sublime Text 2, regex search highlights both strings, which means both strings matched).
I know that I can write the reverse regex, one that matches the second string, and build my logic on that, but I just want to understand how (?!) works in regex and what I am doing wrong.
Thanks.
asked Aug 15, 2013 at 9:40 by alisa kibin, edited Aug 15, 2013 at 9:51 by HamZa
Comments:
- regular-expressions.info/lookaround.html – Bergi, Aug 15, 2013 at 9:42
- I typically find negative look-behind easier to understand, but unfortunately this is JavaScript :) – Ja͢ck, Aug 15, 2013 at 10:03
4 Answers
You need a .* inside the lookahead (and move it in front of the .* outside the lookahead):
^mail(?!.*\.wow\.)\.(.*)$
Otherwise, your lookahead checks only at the end of the string. At the end of the string nothing follows, so there can never be a .wow. there and the negative lookahead always passes. You could just as well move the lookahead to the beginning of the pattern now:
^(?!.*\.wow\.)mail\.(.*)$
This last variant is slightly less efficient, but I find patterns a bit easier to read if lookaheads that affect the entire string are all at the beginning of the pattern.
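As a quick check of the two patterns above, here is a minimal JavaScript sketch. The string mail.foo.bar. is made up for illustration and the variable names are hypothetical. Note that both patterns reject a .wow. anywhere after mail, so mail.wow.no-1. from the question is rejected as well; the lookahead would need a $ after \.wow\. to reject it only at the end of the string.

```javascript
// Lookahead placed right after "mail" (first pattern above).
const afterMail = /^mail(?!.*\.wow\.)\.(.*)$/;
// Lookahead placed at the very start (second pattern above).
const atStart = /^(?!.*\.wow\.)mail\.(.*)$/;

// "mail.foo.bar." is a made-up sample; the other two come from the question.
const samples = ["mail.foo.bar.", "mail.ololo.wow.", "mail.wow.no-1."];

// Run both patterns against every sample and collect the results.
const results = samples.map((s) => ({
  input: s,
  afterMail: afterMail.test(s),
  atStart: atStart.test(s),
}));

console.log(results);
// mail.foo.bar.   -> true  / true   (no ".wow." anywhere)
// mail.ololo.wow. -> false / false  (".wow." at the end)
// mail.wow.no-1.  -> false / false  (".wow." right after "mail")
```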
It's a zero-width assertion, specifically a negative lookahead.
What it says is: at this position, the following cannot come next.
So, for example, (?!q) means that the letter q cannot follow right now.
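To make "zero width" concrete, here is a small JavaScript sketch (the strings and variable names are made up for illustration). The lookahead inspects the next character but does not become part of the match.

```javascript
// "a" must not be followed by "q".
const notBeforeQ = /a(?!q)/;

const followedByQ = notBeforeQ.test("aq"); // "q" follows, so the match is rejected
const followedByB = notBeforeQ.test("ab"); // "b" follows, so the match is allowed
const atEnd = notBeforeQ.test("a");        // nothing follows, so the match is allowed

// The lookahead consumes nothing: only "a" ends up in the matched text.
const matchedText = "ab".match(notBeforeQ)[0];

console.log(followedByQ, followedByB, atEnd, matchedText);
// false true true a
```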
What you want is
^mail\.(?!.*\.wow\.$).*$
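Tried against the two strings from the question, this pattern should behave as follows (a minimal JavaScript sketch; the variable names are hypothetical). The $ inside the lookahead is what limits the check to a .wow. at the very end of the string.

```javascript
// After "mail.", refuse to continue if the rest of the string ends in ".wow."
const pattern = /^mail\.(?!.*\.wow\.$).*$/;

const first = pattern.test("mail.wow.no-1.");   // ".wow." is not at the end
const second = pattern.test("mail.ololo.wow."); // ".wow." is at the end

console.log(first, second);
// true false
```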
As others have stated, (?!) is a negative zero-width lookahead assertion: it does not consume any characters, but looks at the characters that follow and makes sure they do not match what is contained within the parentheses.
JavaScript borrowed its regular expression syntax from Perl (this family of syntax is often loosely called PCRE, or Perl-Compatible Regular Expressions). However, at the time of writing JavaScript only had lookaheads, that is, assertions that look from the current point forward. Perl also has a negative zero-width lookbehind, which would work the way your original example intended and is easier in this case:
# this is how it could be done in Perl
^mail\..*(?<!\.wow\.)$
However, JavaScript did not support lookbehinds when this was written; they were added later, in ECMAScript 2018.
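For readers on a newer runtime: ECMAScript 2018 added lookbehind to JavaScript, so the Perl pattern above can be used as-is in engines that implement it. The sketch below assumes such an engine (for example a recent Node.js or a current browser); on an engine without lookbehind support the regex literal is a SyntaxError. Variable names are hypothetical.

```javascript
// Negative lookbehind: at the end of the string, the preceding text must not be ".wow."
const endsWithoutWow = /^mail\..*(?<!\.wow\.)$/;

const first = endsWithoutWow.test("mail.wow.no-1.");   // ends in "no-1."
const second = endsWithoutWow.test("mail.ololo.wow."); // ends in ".wow."

console.log(first, second);
// true false
```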
Reading this regex step by step: it looks for "mail.", then matches as many characters as possible, then checks whether ".wow." exists between that point (i.e. the end of the string) and the end of the string. Nothing is left there, so the negative lookahead always succeeds.
^mail\.(.*)(?!\.wow\.)$
Instead, reorder it so that it checks whether ".wow." comes after "mail.".
^mail\.(?!.*\.wow\.)(.*)$
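A side-by-side check of the original pattern and the reordered one in JavaScript, using the two strings from the question (a minimal sketch; the variable names are hypothetical):

```javascript
// Original: the lookahead runs only after (.*) has already consumed everything.
const original = /^mail\.(.*)(?!\.wow\.)$/;
// Reordered: the lookahead runs right after "mail.", before (.*) consumes anything.
const reordered = /^mail\.(?!.*\.wow\.)(.*)$/;

const inputs = ["mail.wow.no-1.", "mail.ololo.wow."];

const originalResults = inputs.map((s) => original.test(s));
const reorderedResults = inputs.map((s) => reordered.test(s));

console.log(originalResults);  // [ true, true ]
console.log(reorderedResults); // [ true, false ]
```

In the first string the text after "mail." is "wow.no-1.", which has no leading dot before "wow", so the reordered lookahead does not reject it.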