JavaScript regex character exclusion - Stack Overflow

I have this JavaScript regex:

/^[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/

but I would like to exclude some of the letters in a-zA-Z, namely qvxQVX. How do I change the regex to achieve this?

Asked May 27, 2012 at 22:27 by Mariusz Grodek; edited Jun 12, 2023 at 13:30 by user3064538.

5 Answers

You can still use ranges, but they have to skip those letters, so something like A-PR-UWYZ for the uppercase letters (and a-pr-uwyz for the lowercase ones).

The best way to do this is to simply update the range to exclude the letters you don't want. That would leave you with this:

/^[a-pr-uwyzA-PR-UWYZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/
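A quick way to check the explicit-range version is to run it against a few strings. The sketch below copies the pattern from this answer; the sample words are hypothetical inputs chosen only for illustration.

```javascript
// Explicit ranges that skip q, v, x and Q, V, X (pattern copied from the answer above).
const polishNoQvx = /^[a-pr-uwyzA-PR-UWYZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/;

// Hypothetical sample inputs, not taken from the question.
const samples = ['Łódź', 'żółw', 'quiz', 'Xawery', 'vat'];

for (const word of samples) {
  // Prints true for the first two words and false for the last three,
  // because each of those contains one of the excluded letters.
  console.log(word, polishNoQvx.test(word));
}
```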

You can pull off a form of character class subtraction using negative lookahead. However, it will be less efficient, since the negative lookahead is evaluated again at every character position in the string. In any case, here's what that would look like:

/^(?:(?![qvxQVX])[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ])+$/

This technique works best when the character class is not repeated an unbounded number of times, as it is here.
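To see that the lookahead form accepts the same letters as the explicit-range form, one can compare the two patterns character by character. This is only a sanity-check sketch; the `alphabet` string is an assumption that simply lists every letter allowed by the original class.

```javascript
// Both patterns are copied from the answers above.
const explicit = /^[a-pr-uwyzA-PR-UWYZęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/;
const lookahead = /^(?:(?![qvxQVX])[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ])+$/;

// Every letter the original (unrestricted) class allows.
const alphabet =
  'abcdefghijklmnopqrstuvwxyz' +
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
  'ęóąśłżźćńĘÓĄŚŁŻŹĆŃ';

// Letters on which the two patterns give different answers (should be none).
const disagreements = [...alphabet].filter(
  (ch) => explicit.test(ch) !== lookahead.test(ch)
);

// Letters rejected by the lookahead pattern.
const rejected = [...alphabet].filter((ch) => !lookahead.test(ch)).join('');

console.log(disagreements.length); // 0
console.log(rejected);             // qvxQVX
```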

Several regex flavors including Java and .NET efficiently support character class subtraction using special syntax.

In Java, intersect with a negated group:

/^[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ&&[^qvxQVX]]+$/

A little-known fact is that the Opera web browser actually supports the above Java syntax in its native JavaScript regular expressions. Opera might remove this feature in the future, since it is nonstandard (it's based on abandoned ES4 proposals), but it works in the current version (v11.64), at least.

.NET, XPath, and XML Schema support the following, simpler syntax for character class subtraction:

/^[a-zA-ZęóąśłżźćńĘÓĄŚŁŻŹĆŃ-[qvxQVX]]+$/

You cannot. In this case you need to manually enumerate all the letters except the excluded ones, QVXqvx.

Lookaround-based workarounds are no longer necessary with the newly introduced v flag (see the feature proposal, which is at Stage 4 of the TC39 process as of May 16, 2023).

If your JavaScript environment supports the v flag, you can use the -- operator inside a character class to perform character class subtraction. Here is an example that matches all Greek letters except Pi:

console.log(/[\p{Script_Extensions=Greek}--π]/v.test('π')) // => false
console.log(/[\p{Script_Extensions=Greek}]/v.test('π'))    // => true

In your case, to "exclude" Q, V, X, q, v and x from the a-zA-Z range you can use nested character classes:

console.log(Array.from(
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'.matchAll(
    /[[A-Za-z]--[QVXqvx]]/vg)).flat().join(""))
// => ABCDEFGHIJKLMNOPRSTUWYZabcdefghijklmnoprstuwyz

As of May 19, 2023, V8 v11.0 (Chrome 110) offers experimental support for this new functionality via the --harmony-regexp-unicode-sets flag.
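Because support for the v flag varies between engines, one option is to feature-detect it and fall back to explicit ranges. The following is a minimal sketch under that assumption; `buildNameRegex` and `POLISH` are hypothetical names introduced here, and the pattern applies the -- operator to the full character class from the question.

```javascript
// Polish letters from the original pattern (hypothetical helper constant).
const POLISH = 'ęóąśłżźćńĘÓĄŚŁŻŹĆŃ';

// Hypothetical helper: prefer v-flag set subtraction, else use explicit ranges.
function buildNameRegex() {
  try {
    // Under the v flag, [[...]--[...]] subtracts the second class from the first.
    return new RegExp(`^[[a-zA-Z${POLISH}]--[qvxQVX]]+$`, 'v');
  } catch (err) {
    // Engines without v-flag support reject the flag with a SyntaxError,
    // so fall back to ranges that simply skip the excluded letters.
    return new RegExp(`^[a-pr-uwyzA-PR-UWYZ${POLISH}]+$`);
  }
}

const nameRegex = buildNameRegex();

console.log(nameRegex.flags.includes('v') ? 'using v flag' : 'using fallback ranges');
console.log(nameRegex.test('żółw')); // true
console.log(nameRegex.test('quiz')); // false
```

Constructing the pattern with `new RegExp` rather than a regex literal matters here: a literal with an unsupported flag fails when the script is parsed, before any try/catch can run.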

/^[a-pA-PR-Ur-uWwYyZzęóąśłżźćńĘÓĄŚŁŻŹĆŃ]+$/
