javascript - What is replaceAll performance secret? [HTML escape] - Stack Overflow

I spent some time looking for the best way to escape an HTML string and found some discussions on that: discussion 1, discussion 2. They led me to the replaceAll function. Then I ran performance tests and tried to find a solution with similar speed, with no success :(

Here is my final test case set. I found it on the net and expanded it with my own attempts (the 4 cases at the bottom), and I still cannot reach replaceAll() performance.

What is the secret that makes the replaceAll() solution so speedy?

Greets!

Code snippets:

String.prototype.replaceAll = function(str1, str2, ignore) 
{
   return this.replace(new RegExp(str1.replace(/([\/\,\!\\\^\$\{\}\[\]\(\)\.\*\+\?\|\<\>\-\&])/g,"\\$&"),(ignore?"gi":"g")),(typeof(str2)=="string")?str2.replace(/\$/g,"$$$$"):str2);
};

Credits to qwerty.

Fastest case so far:

html.replaceAll('&', '&amp;').replaceAll('"', '&quot;').replaceAll("'", '&#39;').replaceAll('<', '&lt;').replaceAll('>', '&gt;');
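
To make the one-liner above easier to follow, here is a minimal sketch of the same idea as a standalone function. The name replaceAllLiteral is hypothetical and not part of the original post; it only illustrates the two escaping steps (regex metacharacters in the search string, "$" in the replacement string) and makes no claim about speed.

```javascript
// Hypothetical standalone helper (not from the original post); it mirrors the
// prototype method above without modifying String.prototype.
function replaceAllLiteral(text, search, replacement, ignoreCase) {
  // Escape regex metacharacters so `search` is matched literally.
  // "$&" in a replacement string stands for the matched text.
  var pattern = search.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Double every "$" so a string replacement is inserted verbatim.
  var safe = typeof replacement === "string"
    ? replacement.replace(/\$/g, "$$$$")
    : replacement;
  return text.replace(new RegExp(pattern, ignoreCase ? "gi" : "g"), safe);
}

console.log(replaceAllLiteral("1 < 2 && 3 > 2", "&", "&amp;"));
```

Engines that support ES2021 also ship a native String.prototype.replaceAll, which treats a string search argument literally.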

asked Jul 3, 2013 at 7:28 by Saram; edited May 23, 2017 at 10:27
  • 2 Many built in methods are implemented in native code and pre-optimized (regexes being one), emulating them in Javascript in a speedier way is just plain hard to do. – Joachim Isaksson Commented Jul 3, 2013 at 7:32
  • sure, but why "replace new RegExp" case is so slow. It uses RegExp too. – Saram Commented Jul 3, 2013 at 7:34
  • still replace without regex seems to be faster jsperf.com/replaceallvssplitjoin – Mr_Green Commented Jul 3, 2013 at 7:54
  • 1 @Mr_Green The multiple replace is wrong, because it only replaces the first occurrence :) – Ja͢ck Commented Jul 3, 2013 at 7:59
  • 3 Always compile your regexes; jsperf.com/htmlencoderegex/32 – Joachim Isaksson Commented Jul 3, 2013 at 8:21

3 Answers

Finally I found it! Thanks to Jack for pointing me to the jsperf specifics:

I should note that the test results are strange; when .replaceAll() is defined inside Benchmark.prototype.setup it runs twice as fast compared to when it's defined globally (i.e. inside a <script> tag). I'm still not sure why that is, but it definitely must be related to how jsperf itself works.

The answer is:

replaceAll - this hits a jsperf limitation/bug triggered by the special sequence "\\$&", so the results were wrong.

compile() - when called with no argument, it changes the regexp definition to /(?:)/. I don't know whether that is a bug or not, but the performance results were unreliable after it was called.

Here are my result-safe tests.

Finally, I prepared proper test cases.

The result is that, for HTML escaping, the best way is to use a native DOM-based solution, like:

document.createElement('div').appendChild(document.createTextNode(html)).parentNode.innerHTML

or, if you repeat it many times, you can do it with variables prepared once:

// prepare the reusable nodes once
var DOMtext = document.createTextNode("test");
var DOMnative = document.createElement("span");
DOMnative.appendChild(DOMtext);

// main work for each call: set the text, read back the serialized HTML
function HTMLescape(html){
  DOMtext.nodeValue = html;
  return DOMnative.innerHTML;
}

Thank you all for collaborating and for posting comments and directions.

jsperf bug description

The String.prototype.replaceAll was defined as follows:

function (str1, str2, ignore) {
  return this.replace(new RegExp(str1.replace(repAll, "\\#{setup}"), (ignore ? "gi" : "g")), (typeof(str2) == "string") ? str2.replace(/\$/g, "$$") : str2);
}
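
To see why losing "\\$&" and "$$$$" matters, here is a small sketch of how those two sequences behave in a JavaScript replacement string. The variable names are illustrative only, and the sketch does not reproduce jsperf itself; it just compares the intended "$$$$" with the "$$" that appears in the definition above.

```javascript
// "$&" in a replacement string inserts the matched text, so "\\$&" puts a
// backslash in front of every matched metacharacter.
var escapedPattern = "a.b".replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// "$$" in a replacement string yields one literal "$".
// "$$$$" therefore doubles each "$"; "$$" leaves the string unchanged.
var intended = "$&".replace(/\$/g, "$$$$"); // becomes "$$&"
var mangled = "$&".replace(/\$/g, "$$");    // stays "$&"

// Used as replacement strings, the two behave differently:
var withEscape = "x".replace(/x/g, intended);   // literal "$&" is inserted
var withoutEscape = "x".replace(/x/g, mangled); // "$&" expands to the match

console.log(escapedPattern, intended, mangled, withEscape, withoutEscape);
```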

As far as performance goes, I find that the below function is as good as it gets:

String.prototype.htmlEscape = function() {
    var amp_re = /&/g, sq_re = /'/g, quot_re = /"/g, lt_re = /</g, gt_re = />/g;

    return function() {
        return this
          .replace(amp_re, '&amp;')
          .replace(sq_re, '&#39;')
          .replace(quot_re, '&quot;')
          .replace(lt_re, '&lt;')
          .replace(gt_re, '&gt;');
    }
}();

It initializes the regular expressions and returns a closure that actually performs the replacement.
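
For readers who prefer not to extend String.prototype, the same closure pattern can be sketched as a plain function. The name htmlEscape used as a standalone function is an assumption for illustration. Note that the ampersand has to be replaced first; otherwise the "&" in entities produced by the later steps would be escaped a second time.

```javascript
// Standalone variant of the closure pattern above (illustrative sketch).
var htmlEscape = (function () {
  // The regular expressions are created once, when the outer function runs.
  var amp_re = /&/g, sq_re = /'/g, quot_re = /"/g, lt_re = /</g, gt_re = />/g;

  // The returned closure reuses them on every call.
  return function (text) {
    return String(text)
      .replace(amp_re, '&amp;') // must come first
      .replace(sq_re, '&#39;')
      .replace(quot_re, '&quot;')
      .replace(lt_re, '&lt;')
      .replace(gt_re, '&gt;');
  };
})();

console.log(htmlEscape('<a href="x">Tom & Jerry\'s</a>'));
// prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```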

Performance test

I should note that the test results are strange; when .replaceAll() is defined inside Benchmark.prototype.setup it runs twice as fast compared to when it's defined globally (i.e. inside a <script> tag). I'm still not sure why that is, but it definitely must be related to how jsperf itself works.

Using RegExp.compile()

I wanted to avoid using a deprecated function, mostly because this kind of optimization should be done automatically by modern browsers. Here's a version with compiled expressions:

String.prototype.htmlEscape2 = function() {
    var amp_re = /&/g, sq_re = /'/g, quot_re = /"/g, lt_re = /</g, gt_re = />/g;

    // compile() is deprecated; only call it where it exists
    if (RegExp.prototype.compile) {
        amp_re.compile();
        sq_re.compile();
        quot_re.compile();
        lt_re.compile();
        gt_re.compile();
    }

    return function() {
        return this
          .replace(amp_re, '&amp;')
          .replace(sq_re, '&#39;')
          .replace(quot_re, '&quot;')
          .replace(lt_re, '&lt;')
          .replace(gt_re, '&gt;');
    }
}();

Doing so blows everything else out of the water!

Performance test

The reason why .compile() gives such a performance boost is that when you compile a global expression, e.g. /a/g, it gets converted to /(?:)/ (on Chrome), which renders it useless.

If compilation can't be done, a browser should throw an error instead of silently destroying the expression.
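
The effect can be checked directly. The sketch below uses a hypothetical helper, inspectCompile, that records a regex's source and global flag before and after a no-argument compile() call and then tries a replacement. It assumes an engine that still provides the deprecated RegExp.prototype.compile; where the method is missing, the regex is left untouched.

```javascript
// Hypothetical helper (not from the original post): observe what a
// no-argument compile() does to a global regex.
function inspectCompile() {
  var re = /&/g;
  var before = { source: re.source, global: re.global };

  // compile() is a deprecated legacy method, so guard the call.
  if (typeof re.compile === "function") {
    re.compile(); // no pattern and no flags are passed
  }

  var after = { source: re.source, global: re.global };
  // If the pattern was reset, this no longer escapes every ampersand.
  var escaped = "a & b & c".replace(re, "&amp;");
  return { before: before, after: after, escaped: escaped };
}

console.log(inspectCompile());
```

If the "after" entry no longer shows the original pattern and flag, the benchmark is timing a regex that does almost no work, which is consistent with the speedup described above.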

Actually, there are faster ways to do this.

If you can do an inline split and join, you will get better performance.

//example below
var test = "This is a test string";
var test2 = test.split("a").join("A");

Try this and run the performance test.
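
Applied to the HTML-escape case from the question, the split/join idea could look like the sketch below. The function name escapeBySplitJoin is an assumption for illustration, and no timing is claimed here; whether it beats the regex versions needs to be measured in the target browser.

```javascript
// Illustrative sketch: HTML escaping with split/join instead of regexes.
function escapeBySplitJoin(html) {
  return html
    .split("&").join("&amp;") // ampersand first, so entities are not re-escaped
    .split('"').join("&quot;")
    .split("'").join("&#39;")
    .split("<").join("&lt;")
    .split(">").join("&gt;");
}

console.log(escapeBySplitJoin('<b>"Tom" & Jerry</b>'));
// prints: &lt;b&gt;&quot;Tom&quot; &amp; Jerry&lt;/b&gt;
```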
