I have this JavaScript regex:
/^[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/
but I would like to exclude some of the letters in a-zA-Z, namely qvxQVX. How do I change the regex to achieve this?
5 Answers
You can still do ranges, but you'll have to do ranges that exclude those letters, so something like A-PR-UWYZ.
The best way to do this is to simply update the range to exclude the letters you don't want. That would leave you with this:
/^[a-pr-uwyzA-PR-UWYZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/
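A quick way to sanity-check the adjusted ranges is to run them against a few words. This is only a sketch: the sample words and the names `polishNoQvx`, `samples` and `accepted` are made up for illustration and are not from the question.

```javascript
// Explicit ranges that skip q, v, x (and Q, V, X), plus the Polish letters.
const polishNoQvx = /^[a-pr-uwyzA-PR-UWYZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/;

// Hypothetical sample inputs, chosen only for illustration.
const samples = ['Łódź', 'zażółć', 'quiz', 'Xawery', 'vis'];

// Keep only the words made up entirely of allowed letters.
const accepted = samples.filter((word) => polishNoQvx.test(word));

console.log(accepted); // the words that contain none of q, v, x in either case
```

Words containing q, v or x in either case should be filtered out, while the Polish-only words pass.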
You can pull off a form of character class subtraction using negative lookahead. However, it will be less efficient, since the negative lookahead is re-evaluated at every character position in the string. In any case, here's what that would look like:
/^(?:(?![qvxQVX])[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ])+$/
This works best when you're not repeating a character class an unlimited number of times like this.
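If you want to convince yourself that the lookahead version accepts the same characters as a class with hand-written ranges, you can compare the two character by character. A minimal sketch, assuming the variable names below (they are illustrative only) and a small hand-picked set of test characters:

```javascript
// Lookahead-based "subtraction" from this answer.
const viaLookahead = /^(?:(?![qvxQVX])[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ])+$/;
// Equivalent class written with explicit ranges that skip q, v, x.
const viaRanges = /^[a-pr-uwyzA-PR-UWYZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/;

// Test characters: both ASCII cases, the Polish letters, and a few non-letters.
const probe =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
  'abcdefghijklmnopqrstuvwxyz' +
  'ęóąśłżźćńĘÓĄŚŁŻŹĆŃ' +
  '0_ -';

// Collect every character on which the two patterns give different answers.
const disagreements = [...probe].filter(
  (ch) => viaLookahead.test(ch) !== viaRanges.test(ch)
);

console.log(disagreements.length); // => 0
```

This only checks single characters from the probe string, so it is a spot check rather than a proof.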
Several regex flavors including Java and .NET efficiently support character class subtraction using special syntax.
In Java, intersect with a negated group:
/^[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ&&[^qvxQVX]]+$/
A little-known fact is that the Opera web browser actually supports the above Java syntax in its native JavaScript regular expressions. Opera might remove this feature in the future since it is nonstandard (it's based on abandoned ES4 proposals), but it works in the current version (v11.64), at least.
.NET, XPath, and XML Schema support the following, simpler syntax for character class subtraction:
/^[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ-[qvxQVX]]+$/
You cannot. In this case you need to manually enumerate all the letters except the excluded ones, QVXqvx.
No more lookaround-based workarounds are necessary with the newly introduced v flag (see the feature proposal, which is at Stage 4 of the TC39 process as of May 16, 2023).
If your JavaScript environment supports the v flag, you can use the -- operator in a character class to perform character class subtraction. Here is an example that matches all Greek letters but Pi:
console.log(/[\p{Script_Extensions=Greek}--π]/v.test('π')) // => false
console.log(/[\p{Script_Extensions=Greek}]/v.test('π')) // => true
In your case, to "exclude" Q, V, X, q, v and x from an a-zA-Z range, you can use nested character classes:
console.log(Array.from(
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.matchAll(
/[[A-Za-z]--[QVXqvx]]/vg)).flat().join(""))
// => ABCDEFGHIJKLMNOPRSTUWYZabcdefghijklmnoprstuwyz
As of May 19, 2023, V8 v11.0 (Chrome 110) offers experimental support for this new functionality via the --harmony-regexp-unicode-sets flag.
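Since support for the v flag varies between engines, one option is to feature-detect it and fall back to the lookahead pattern from the other answer. The sketch below applies the subtraction to the full character class from the question; the helper name `buildPolishWordRegex` is hypothetical, and the fallback relies on `new RegExp` throwing a SyntaxError for a flag the engine does not know.

```javascript
// Hypothetical helper: builds a regex for words made of Latin and Polish
// letters, excluding q, v, x in either case.
function buildPolishWordRegex() {
  const letters = 'a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ';
  try {
    // Nested classes and the -- operator are only valid with the v flag.
    return new RegExp(`^[[${letters}]--[qvxQVX]]+$`, 'v');
  } catch (err) {
    // Engines without v-flag support reject the flag, so fall back to
    // the negative-lookahead form of the same subtraction.
    return new RegExp(`^(?:(?![qvxQVX])[${letters}])+$`);
  }
}

const polishWord = buildPolishWordRegex();
console.log(polishWord.test('Łódź')); // => true
console.log(polishWord.test('quiz')); // => false
```

Both branches should accept and reject the same strings, so callers do not need to know which one was used.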
/^[a-pA-PR-Ur-uWwYyZzęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/