I want a regex matching a specific word that is not surrounded by any alphanumeric character. My thought was to include a negation before and after:
[^a-zA-Z\d]myspecificword[^a-zA-Z\d]
So it would match:
myspecificword
_myspecificword_
-myspecificword
And not match:
notmyspecificword
myspecificword123
But this simple regex won't match the word by itself unless it is preceded by whitespace:
myspecificword // no match
 myspecificword // match
Using the flags "gmi" and testing with JavaScript. What am I doing wrong? Shouldn't it be as simple as that?
https://regex101./r/BCkbVQ/3
asked Feb 28, 2020 at 16:04 by GRoutar
- Regex word boundary is \b – user47589 Commented Feb 28, 2020 at 16:05
- @Amy That won't work because underscore is considered a word character. – Barmar Commented Feb 28, 2020 at 16:08
- Use negative lookarounds. regular-expressions.info/lookaround.html – Barmar Commented Feb 28, 2020 at 16:08
- It doesn't work because [^a-zA-Z\d] needs to match an actual character. There's no character at the beginning or end. – Barmar Commented Feb 28, 2020 at 16:09
- /(?<![a-z\d])myspecificword(?![a-z\d])/ig should work but keep in mind lookbehind is not supported in older browsers. – anubhava Commented Feb 28, 2020 at 16:16
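To illustrate the point made in the comments, here is a minimal sketch comparing the question's pattern with the lookaround version suggested above. The variable names and sample strings are mine, not from the thread, and it assumes a JavaScript engine with lookbehind support.

```javascript
// Original pattern: each character class must consume one real character,
// so it cannot match at the very start or end of the input.
const original = /[^a-zA-Z\d]myspecificword[^a-zA-Z\d]/i;

// Lookaround version from the comments: asserts on the neighbours
// without consuming them, so start/end of input is acceptable.
const lookaround = /(?<![a-z\d])myspecificword(?![a-z\d])/i;

const samples = [
  "myspecificword",
  " myspecificword ",
  "-myspecificword",
  "notmyspecificword",
];

const results = samples.map((s) => ({
  input: s,
  original: original.test(s),
  lookaround: lookaround.test(s),
}));

console.table(results);
```

In this sketch the original pattern only accepts the sample that has a real character on both sides, while the lookaround version accepts the first three and rejects the last.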
5 Answers
Try using:
(?<![^\s_-])myspecificword(?![^\s_-])
This says to match myspecificword when it is surrounded, on both sides, by either the start/end of the input, whitespace, underscore, or dash.
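As a rough check of this pattern against the question's examples, the sketch below runs it over each case. The extra "(myspecificword)" case and the variable names are my own additions. Because the pattern whitelists specific neighbours, other punctuation such as parentheses blocks the match.

```javascript
// Pattern from this answer; the "i" flag mirrors the flags in the question.
const whitelist = /(?<![^\s_-])myspecificword(?![^\s_-])/i;

const cases = [
  "myspecificword",
  "_myspecificword_",
  "-myspecificword",
  "notmyspecificword",
  "myspecificword123",
  "(myspecificword)", // hypothetical extra case, not in the question
];

// Map each input to whether the pattern finds a match in it.
const outcome = Object.fromEntries(cases.map((s) => [s, whitelist.test(s)]));
console.log(outcome);
```

The first three inputs match and the last three do not.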
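It is not whitespace that is required, but any character that matches [^a-zA-Z\d].
You should use:
(?:^|[^a-zA-Z\d])myspecificword(?:[^a-zA-Z\d]|$)
The main benefit is that it needs no lookbehind, so it works in virtually every regex engine. Note that the match includes the delimiter character on either side, when one is present.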
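One consequence of consuming the delimiters is worth seeing in code. In the sketch below, the capturing group around the word and the lookahead variant are my additions, not part of the answer. Two occurrences separated by a single character share that delimiter, so a global search with the consuming pattern finds only the first.

```javascript
const text = "myspecificword myspecificword";

// Pattern from the answer, with a capturing group added around the word.
const consuming = /(?:^|[^a-zA-Z\d])(myspecificword)(?:[^a-zA-Z\d]|$)/gi;

// Variant: the trailing side is a lookahead, so the shared delimiter
// is still available as the leading delimiter of the next occurrence.
const withLookahead = /(?:^|[^a-zA-Z\d])(myspecificword)(?=[^a-zA-Z\d]|$)/gi;

// m[0] is the full match (delimiters included); m[1] is the bare word.
const hits = [...text.matchAll(consuming)].map((m) => ({ full: m[0], word: m[1] }));
const hitsLookahead = [...text.matchAll(withLookahead)].map((m) => m[1]);

console.log(hits.length, JSON.stringify(hits[0].full)); // one hit, with trailing space
console.log(hitsLookahead.length); // both occurrences found
```

Lookahead is supported far more widely than lookbehind, so the variant keeps most of the portability benefit.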
If you truly mean "not surrounded by alphanumerics", with _ not counted as alphanumeric (and in your attempted regex you seem to be willing to match anything that isn't a letter or digit), then any of the following should be acceptable:
'myspecificword'
'_myspecificword_'
' myspecificword '
'-myspecificword-'
'(myspecificword)'
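And the regex should be:
(?<![^_\W])myspecificword(?![^_\W])
Here [^_\W] matches any word character other than _, which in JavaScript means an ASCII letter or digit.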
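// The first five strings should match; the last two should not.
let tests = ['myspecificword',
  '_myspecificword_',
  ' myspecificword ',
  '-myspecificword-',
  '(myspecificword)',
  'amyspecificword',
  '1myspecificword'
];
// The lookarounds reject only a letter or digit next to the word.
let regex = /(?<![^_\W])myspecificword(?![^_\W])/;
for (let test of tests) {
  console.log(regex.test(test)); // true for the first five, false for the last two
}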
The "accepted" answer will not match (myspecificword), for example.
The title of this question is
Regex for word not surrounded by alphanumeric characters
The other answers have all addressed a different question (which may well be the one intended):
Regex for word neither preceded nor followed by alphanumeric characters
I will refer to these statements as #1 and #2 respectively.
If the specified word were 'cat' and the string were '9cat', then 'cat' has an alphanumeric character on only one side, so it is not surrounded by alphanumeric characters: there is a match with #1, but not with #2.
For #1, one could use the regex:
/cat(?!\p{Alnum})|(?<!\p{Alnum})cat/
("match 'cat' not followed by a Unicode alphanumeric character, or 'cat' not preceded by a Unicode alphanumeric character"), though it's easier to test for the negation:
/(?<=\p{Alnum})cat(?=\p{Alnum})/
The test passes if the string does not match this regex.
With interpretation #2, the regex is:
/(?<!\p{Alnum})cat(?!\p{Alnum})/
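Since the question is about JavaScript, here is a hedged sketch of both interpretations in that language. JavaScript has no \p{Alnum} property escape, so this assumes "alphanumeric" can be approximated as any Unicode letter or number, [\p{L}\p{N}], with the u flag. The helper names are hypothetical.

```javascript
// Assumption: "alphanumeric" means any Unicode letter or number.
const ALNUM = "[\\p{L}\\p{N}]";

// Negation of #1: 'cat' with an alphanumeric on BOTH sides.
const surrounded = new RegExp(`(?<=${ALNUM})cat(?=${ALNUM})`, "u");

// #2: 'cat' with an alphanumeric on NEITHER side.
const isolated = new RegExp(`(?<!${ALNUM})cat(?!${ALNUM})`, "u");

// #1 passes when the string does NOT match the "surrounded" regex.
const passesStatement1 = (s) => !surrounded.test(s);
const passesStatement2 = (s) => isolated.test(s);

for (const s of ["cat", "9cat", "concatenate"]) {
  console.log(s, passesStatement1(s), passesStatement2(s));
}
```

With these definitions, 'cat' passes both statements, '9cat' passes only #1, and 'concatenate' passes neither. Because #1 is tested by negation, a string that does not contain 'cat' at all also passes it.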
I think this will work:
/[^a-z0-9]?myspecificword[^a-z0-9]?/i
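A quick check suggests this last pattern does not satisfy the question's "should not match" cases. Both character classes are optional, so each can match zero characters, and the word is then found inside a longer alphanumeric run. The sketch below uses the word spelled as in the question; the variable names are mine.

```javascript
// Both classes are optional, so they impose no constraint on the neighbours.
const optional = /[^a-z0-9]?myspecificword[^a-z0-9]?/i;

// Inputs the question says should NOT match.
const shouldNotMatch = ["notmyspecificword", "myspecificword123"];

// Collect the inputs the pattern nevertheless matches.
const unwanted = shouldNotMatch.filter((s) => optional.test(s));
console.log(unwanted);
```

Both inputs end up in the unwanted list, so dropping the ? quantifiers and handling start/end of input explicitly (or using lookarounds) is still needed.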