regex - javascript regular expression - removing comments - Stack Overflow

This example is from the book Eloquent JavaScript. Although there is a short explanation in the book, it is really hard to follow. Can anyone explain it from a beginner's perspective? I am having a hard time following which slash is for what.

function stripComments(code) {
  return code.replace(/\/\/.*|\/\*[^]*\*\//g, "");
}

asked Aug 28, 2014 at 12:10 by Wahid Polin
  • A nice way to try and understand a regular expression is to use something like Debuggex. – phuzi, Aug 28, 2014 at 12:33

4 Answers

A comment can have two forms:

// this is a comment
/* this is a comment */

Unfortunately, both / and * already have a role here: / delimits a regex literal and * is a quantifier. To match them literally, they must be escaped with a backslash.

So we start, conceptually, with an empty match expression (written out literally like this, JavaScript would parse it as a line comment):

//g

We set it to match the first form, // followed by any number of characters up to the end of the line. That would be //.* but the slashes have to be escaped:

/\/\/.*/g

The other form, /* followed by anything followed by */, would be /*[^]**/ but we have to escape the literal slashes and asterisks:

\/\*[^]*\*\/

The two forms are then combined with a | character, which denotes "or":

\/\/.*|\/\*[^]*\*\/

and inserted into the empty regex

/\/\/.*|\/\*[^]*\*\//g
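To see which slashes are delimiters and which belong to the pattern, it can help to build the same regex from a string, where there are no delimiters at all. This is only a sketch for comparison; the names fromLiteral, fromString and sample are made up for this example and are not from the book.

```javascript
// The regex literal from the question: the outer slashes are delimiters.
const fromLiteral = /\/\/.*|\/\*[^]*\*\//g;

// The same pattern built from a string. There are no delimiters here, so the
// slashes need no escaping. The asterisks still do, and each backslash is
// doubled because it sits inside a string literal.
const fromString = new RegExp("//.*|/\\*[^]*\\*/", "g");

// Hypothetical sample input, used only for this comparison.
const sample = "let a = 1; // first\nlet b = /* second */ 2;";

// Both regexes strip the same text.
console.log(sample.replace(fromLiteral, "") === sample.replace(fromString, "")); // → true
```

In the string version only the asterisks are escaped, which shows that the backslashes in front of the slashes exist purely because of the literal's / delimiters.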

The first and last slashes are delimiters.

The g at the end is a modifier (also called a flag). Modifiers change how matching is performed, for example case-insensitive (i) or global (g) matching. Here g performs a global match: it finds all matches rather than stopping after the first one.
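The effect of g can be checked with a small experiment. This is a minimal sketch; the variable names and the input string are invented for illustration.

```javascript
// Hypothetical input with two line comments.
const text = "a // one\nb // two";

// Without g, replace() stops after the first match.
const firstOnly = text.replace(/\/\/.*/, "");

// With g, every match is replaced.
const all = text.replace(/\/\/.*/g, "");

console.log(JSON.stringify(firstOnly)); // → "a \nb // two"
console.log(JSON.stringify(all));       // → "a \nb "
```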

| means OR.

\/\/.* has some escaped characters and can be read as: // followed by any characters up to the end of the line.

\/\*[^]*\*\/ also has some escaped characters and can be read as: /* any characters, including newlines */

Note: both / and * must be escaped because they already have a role: / delimits the regex literal and * is a quantifier. So \/ means a literal /, \* means a literal *, and .* means any character except a newline, zero or more times.

Since the goal of your code is to remove comments, all comments like // xxxx or /* xxx */ are replaced by an empty string.
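Running the function on a few inputs shows the replacement in action. The last call also shows a limitation: [^]* is greedy, so with two block comments on one input it swallows everything from the first /* to the last */. The stripCommentsLazy variant is an assumption added here for illustration, not part of the question; it uses the non-greedy [^]*? instead.

```javascript
function stripComments(code) {
  return code.replace(/\/\/.*|\/\*[^]*\*\//g, "");
}

// Illustrative variant (not in the question): *? matches as little as possible.
function stripCommentsLazy(code) {
  return code.replace(/\/\/.*|\/\*[^]*?\*\//g, "");
}

console.log(stripComments("1 + /* 2 */3"));            // → 1 + 3
console.log(stripComments("x = 10;// ten!"));          // → x = 10;
console.log(stripComments("1 /* a */+/* b */ 1"));     // → 1  1
console.log(stripCommentsLazy("1 /* a */+/* b */ 1")); // → 1 + 1
```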

Let's break it down with one token per line:

/    # Start a new regex

# This group of tokens matches comments in the form:
# // this is a comment

\/   # An escaped forward slash
\/   # An escaped forward slash
.*   # Any character except a newline, zero or more times

|    # OR. This means "match either the previous or the next group of tokens".

# This group of tokens matches comments in the form:
# /* 
#  This is a comment, which could include some new lines
# */

\/   # An escaped forward slash
\*   # An escaped asterisk
[^]* # Any character at all, including newlines, zero or more times
\*   # An escaped asterisk
\/   # An escaped forward slash

/    # Finish the current regex.
g    # This regex can match multiple times against a given input
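The difference between . and [^] is easy to check: . stops at a newline, while [^] (any character that is not in the empty set) matches everything. The sketch below uses a made-up input string; the s flag shown last is an ES2018 alternative that is not part of the original regex.

```javascript
// Hypothetical block comment that spans two lines.
const src = "/* line one\nline two */ done";

// . cannot cross the newline, so the closing */ is never reached.
const withDot = /\/\*.*\*\//.test(src);

// [^] matches any character, newlines included.
const withAnyChar = /\/\*[^]*\*\//.test(src);

// Alternative (assumes an ES2018+ engine): the s flag lets . match newlines too.
const withDotAll = /\/\*.*\*\//s.test(src);

console.log(withDot);     // → false
console.log(withAnyChar); // → true
console.log(withDotAll);  // → true
```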

/ --> start of regex

\/ --> escaped "/" character

\/ --> escaped "/" character

.* --> any characters except a newline (possibly none) --> this covers the case // abck

| --> OR

\/ --> escaped "/" character

\* --> escaped "*" character

[^]* --> any characters (across lines, so even \n and \r)

\* --> escaped "*" character

\/ --> escaped "/" character --> this covers the case /* aasd\nasdasd */

/ --> end of regex

g --> global modifier
