Javascript Regex for matching whole words - Stack Overflow

This is a follow-up question to this one

Since JavaScript's regex flavor is quite different from the regex I'm used to, I can't seem to figure out how to improve this pattern.

Here's the current pattern:

var pattern = new RegExp('\\b' + filter[i] + '\\b', 'g');

This works great when the phrase stands alone but if it's located in an anchor tag, the method ends up removing the entire anchor (which is not desirable).

Example

<body>
    This is my text. It's an ass of a time in class
    <a href="http://example.com/1234/ass-hole">ass-hole</a>
</body>

shows up as

<body> This is my text. It's an *** of a time in class ***-hole </body>

in the DOM

What I want it to look like is

<body>
    This is my text. It's an *** of a time in class
    <a href="http://example.com/1234/***-hole">***-hole</a>
</body>

Asked Apr 26, 2011 at 16:55 by Chase Florell; edited May 23, 2017 at 10:26 by Community Bot.

  • There's no way that Regex can be used to remove what you claim it removed. – ikegami, Apr 26, 2011 at 17:00
  • test it for yourself. jsfiddle/Ld93F – Chase Florell, Apr 26, 2011 at 17:02
  • urbandictionary.com/define.php?term=clbuttic – mcgrailm, Apr 26, 2011 at 17:04
  • @mcgrailm I'm not asking if it's a good idea. I'm not searching for an opinion, I'm simply searching for a possible regex solution. – Chase Florell, Apr 26, 2011 at 17:07
  • I understand, I just thought I should put that out there – mcgrailm, Apr 26, 2011 at 17:10

3 Answers

It looks like $('body').text(function (i, txt) { ... }); is giving you the inner text of the body element in one big block, with all of the tags already removed. In other words, your regex is not removing tags, but $('body').text is.

It sounds like you actually want to loop over the descendant text nodes of the body. I'm not familiar with jQuery, so it may have a function that does this for you, but if it doesn't, you can use this one:

function allTextNodes(parent) {

    function getChildNodes(parent) {
        var x, out = [];
        for (x = 0; x < parent.childNodes.length; x += 1) {
            out[x] = parent.childNodes[x];
        }

        return out;
    }

    var cursor, closed = [], open = getChildNodes(parent);

    while (open.length) {
        cursor = open.shift();
        if (cursor.nodeType === 1) {
            open.unshift.apply(open, getChildNodes(cursor));
        }
        if (cursor.nodeType === 3) {
            closed.push(cursor);
        }
    }

    return closed;
}

Using that function (or one like it), try this usage instead:

(function () {
    var x, i, re, rep,
        nodes = allTextNodes(document.body),
        filter = [ 'some', 'words', 'go', 'here' ];

    for (x = 0; x < nodes.length; x += 1) {
        for (i = 0; i < filter.length; i += 1) {
            re = new RegExp('\\b' + filter[i] + '\\b', 'g');
            rep = '****'; // fix this
            if (re.test(nodes[x].nodeValue)) {
                nodes[x].nodeValue = nodes[x].nodeValue.replace(re, rep);
            }
        }
    }
}());

Food for thought: what will happen if you have a filter word that contains a character that has meaning inside a regex? It seems unlikely in this case, but you should consider it all the same.
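
One way to handle that case is to escape regex metacharacters in each filter word before building the pattern. The sketch below is only an illustration: `escapeRegExp` and `buildWordPattern` are hypothetical helper names that do not appear in the original code, and the sample word `a.b` is made up.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the whole-word pattern from an escaped filter word.
function buildWordPattern(word) {
    return new RegExp('\\b' + escapeRegExp(word) + '\\b', 'g');
}

var word = 'a.b'; // made-up filter word containing a metacharacter

// Unescaped, the dot matches any character, so "axb" is replaced as well.
var unescapedResult = 'a.b axb'.replace(new RegExp('\\b' + word + '\\b', 'g'), '***');
console.log(unescapedResult); // *** ***

// Escaped, only the literal "a.b" is replaced.
var escapedResult = 'a.b axb'.replace(buildWordPattern(word), '***');
console.log(escapedResult); // *** axb
```

Escaping does not change how `\b` behaves: it still only anchors where a word character meets a non-word character, so a filter word that begins or ends with punctuation would probably need a different boundary test.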

There's no way that Regex can be used to remove what you claim it removed. The problem is that the input isn't what you claim it is. If you add

alert(txt);

to your function, you'll see that you're actually passing

This is my text. It's an ass of a time in class ass-hole

to it. This is the body's text. Perhaps you want its innerHTML.
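
To see what the pattern does when it is given markup rather than text, it can be run against a plain string. This is a minimal sketch, assuming markup modeled on the question's example and a single filter word; the variable names are illustrative and no DOM is involved.

```javascript
// Markup modeled on the question's example, held as a string
// (roughly what innerHTML would hand back).
var html = "It's an ass of a time in class " +
    '<a href="http://example.com/1234/ass-hole">ass-hole</a>';

// Same construction as in the question, with one filter word.
var pattern = new RegExp('\\b' + 'ass' + '\\b', 'g');

var censored = html.replace(pattern, '***');
console.log(censored);
// It's an *** of a time in class <a href="http://example.com/1234/***-hole">***-hole</a>
```

The anchor survives, and "class" is left alone because there is no word boundary inside it. Two caveats if this is applied to real innerHTML: the replacement also touches tag names and attribute values whenever a filter word happens to match them, and assigning innerHTML rebuilds the child nodes, which drops any event handlers attached to them.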

Next time, please post a minimal, runnable demonstration of the problem up front. It's really bad when you say you have a problem doing a substitution, and the code doesn't perform any substitution.

The problem here is that you're matching \b on either side of the word. This means the word is required to be surrounded by certain characters, and '>' is not one of them.

So in your code, you need to change your regex to allow for '>' to exist on the left side and probably '<' to exist on the right.

var pattern = new RegExp('(\\b|>)' + filter[i] + '(\\b|<)', 'g');

Is probably pretty close to what you need.
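
When a pattern like this is built from a string, two details are easy to get wrong: a single `\b` inside a string literal is a backspace character rather than a word boundary, and spaces inside a group are matched literally. The checks below are a small sketch of both points, using the word "ass" from the question as the sample input.

```javascript
// In a string literal, '\b' is the backspace character (U+0008).
var backspaceCode = '\b'.charCodeAt(0);
console.log(backspaceCode); // 8

var single = new RegExp('\bass\b');     // backspace + "ass" + backspace
var doubled = new RegExp('\\bass\\b');  // the whole word "ass"

console.log(single.test('an ass of a time'));  // false
console.log(doubled.test('an ass of a time')); // true
console.log(doubled.test('class'));            // false

// Spaces inside the group are part of the pattern.
var spaced = new RegExp('(\\b | >)ass');
var tight = new RegExp('(\\b|>)ass');

console.log(spaced.test('>ass')); // false
console.log(tight.test('>ass'));  // true
```

Note that `\b` on its own also matches between `>` and a letter, so it is worth testing whether the extra alternatives change anything for your input.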

A JavaScript RegExp reference can be found here: http://www.javascriptkit.com/javatutors/redev2.shtml
