Javascript Regex ignore case for specific capture group - Stack Overflow

In PCRE this would be a valid expression

/^\!(foo|bar) ((?i)ab|cd|ef|gh)$/

But in JavaScript this regex is not valid. Unfortunately I don't know what (?i) is called, so I'm having trouble googling it. How would I translate this example into something valid in JavaScript?


What I actually want to do:

Find all lines that start with !foo or !bar, followed by a space, and end with ab, cd, ef or gh. Only that last part should be case insensitive.

!foo CD
!foo cD
!foo cd

would all be valid. While

!FOO cd
!Foo cd

would be invalid

Asked Jan 4, 2016 at 2:24 by boop, edited Jan 4, 2016 at 2:30

2 Answers

(?i) is the case-insensitive flag: from the point in the regex where it appears, every letter matches in either case. A character class such as [a-z] then also matches [A-Z] (and vice versa), a single letter a matches a and A, and a sequence ab matches ab, Ab, aB and AB.

So in PCRE you can put it at the beginning of your regex, /(?i)regex/ (equivalent to the JS /regex/i), or combine it with its opposite, (?-i), to make only part of the regex case-insensitive:

/^(?i)[a-z]{2}(?-i)[a-z]{2}/ 

The regex above matches 2 characters in either case followed by 2 strictly lowercase characters.

Matches ->        ROck, rOck, Rock
Does not match -> ROCK, roCk, rOcK
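
JavaScript has traditionally had no (?i)/(?-i) toggles, so a rough equivalent of the PCRE pattern above has to spell out the case-insensitive half as a character class. The following is a minimal sketch, not part of the original answer; the names pattern, samples and results are made up for illustration.

```javascript
// Rough JavaScript equivalent of the PCRE pattern /^(?i)[a-z]{2}(?-i)[a-z]{2}/:
// the case-insensitive half is written as [a-zA-Z], the strict half stays [a-z].
const pattern = /^[a-zA-Z]{2}[a-z]{2}/;

// The six sample strings listed above.
const samples = ["ROck", "rOck", "Rock", "ROCK", "roCk", "rOcK"];
const results = samples.map((s) => [s, pattern.test(s)]);

for (const [s, ok] of results) {
  console.log(`${s}: ${ok ? "match" : "no match"}`);
}
```

The first three strings match and the last three do not, the same split as in the list above.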

What about your PCRE regex?

/^\!(foo|bar) ((?i)ab|cd|ef|gh)$/

If you don't mind also matching strings that start with !Foo, !FOo, !foO, !fOO, !BAR, !bar, ..., you can put the flag outside, like this:

/^!(foo|bar) (ab|cd|ef|gh)$/i # you can also remove the escape from \! -> !

If you instead want the exact equivalent of the original PCRE regex (/^!(foo|bar) ((?i)ab|cd|ef|gh)$/), the JS version is less readable:

/^!(foo|bar) ([Aa][Bb]|[Cc][Dd]|[Ee][Ff]|[Gg][Hh])$/
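
To check the character-class version against the lines from the question, here is a small sketch. The regex is the one above; the variable names and the extra samples "!bar Gh" and "!foo cdx" are assumptions added for illustration.

```javascript
// Pattern from the answer: "!foo"/"!bar" is case-sensitive, the suffix is not.
const command = /^!(foo|bar) ([Aa][Bb]|[Cc][Dd]|[Ee][Ff]|[Gg][Hh])$/;

// Lines from the question, plus two extra illustrative samples.
const valid = ["!foo CD", "!foo cD", "!foo cd", "!bar Gh"];
const invalid = ["!FOO cd", "!Foo cd", "!foo cdx"];

const allValidMatch = valid.every((line) => command.test(line));
const noInvalidMatch = invalid.every((line) => !command.test(line));
console.log(allValidMatch, noInvalidMatch); // true true

// For contrast, the global i flag also accepts the upper-case prefix.
const loose = /^!(foo|bar) (ab|cd|ef|gh)$/i;
console.log(loose.test("!FOO cd")); // true
```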

You can download the ECMAScript (JavaScript) documentation from here:

https://www.ecma-international.org/publications/standards/Ecma-262.htm

RegExp is fully specified there, and it does not follow the more advanced Perl rules, so the inline (?i) syntax is not supported (see the update below for newer engines).

One way to do what you want is to use [...] for each character that needs to match in either case:

(?i)ab

becomes

[aA][bB]

It's a lot more typing, but I do not know of a better solution.

If the entire regex could be in any case, then you could use the flag:

/ab/i

But in your example, that means "foo" would also be accepted as "Foo" or "fOO".
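
If typing the classes by hand gets tedious, they can be generated. The sketch below is one possible approach, not something from the answers: caseInsensitiveSource is a hypothetical helper that turns each letter into a two-character class and escapes everything else.

```javascript
// Hypothetical helper: turns "ab" into "[aA][bB]".
// Letters become a two-character class; anything else is escaped and kept literal.
function caseInsensitiveSource(text) {
  return Array.from(text, (ch) => {
    const lower = ch.toLowerCase();
    const upper = ch.toUpperCase();
    if (lower !== upper && lower.length === 1 && upper.length === 1) {
      return `[${lower}${upper}]`;
    }
    // Escape regex metacharacters so non-letters match themselves.
    return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  }).join("");
}

// Only the suffix alternatives are expanded; the prefix stays case-sensitive.
const suffixes = ["ab", "cd", "ef", "gh"].map(caseInsensitiveSource).join("|");
const built = new RegExp(`^!(foo|bar) (${suffixes})$`);

console.log(built.source);
console.log(built.test("!foo cD"), built.test("!Foo cd")); // true false
```

Building the pattern from a string keeps the list of suffixes readable, at the cost of having to escape any metacharacters yourself.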


Update

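Newer versions of JavaScript do support scoped modifiers, written as a group: (?flags:...), for example (?i:ab|cd). Only the group form is accepted; a bare (?i) in the middle of a pattern is still a syntax error.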

  • DotAll is true if the RegExp object's [[OriginalFlags]] internal slot contains "s" and otherwise is false.
  • IgnoreCase is true if the RegExp object's [[OriginalFlags]] internal slot contains "i" and otherwise is false.
  • Multiline is true if the RegExp object's [[OriginalFlags]] internal slot contains "m" and otherwise is false.
  • Unicode is true if the RegExp object's [[OriginalFlags]] internal slot contains "u" and otherwise is false.

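So the approach in Giuseppe Ricupero's answer works in newer browsers, Node, and so on, as long as the modifier is written in the group form (?i:...) rather than as a bare (?i).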
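
Since not every engine in use supports the group form yet, one cautious option is to try it and fall back to character classes. This is a sketch under that assumption; buildCommandRegex is a hypothetical name, and the pattern is compiled from a string so that an unsupported engine raises a catchable SyntaxError instead of failing when the script is parsed.

```javascript
// Try the scoped-modifier form first; engines without support throw a
// SyntaxError when compiling it, so fall back to explicit character classes.
function buildCommandRegex() {
  try {
    return new RegExp("^!(foo|bar) ((?i:ab|cd|ef|gh))$");
  } catch (err) {
    if (!(err instanceof SyntaxError)) throw err;
    return new RegExp("^!(foo|bar) ([Aa][Bb]|[Cc][Dd]|[Ee][Ff]|[Gg][Hh])$");
  }
}

const commandRegex = buildCommandRegex();
console.log(commandRegex.test("!foo cD")); // true
console.log(commandRegex.test("!FOO cd")); // false

// Capture group 2 holds the suffix exactly as it was typed.
const match = "!bar Gh".match(commandRegex);
console.log(match && match[2]);
```

Both branches keep the prefix case-sensitive and capture the suffix in group 2, so callers do not need to know which one was used.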
