dom - Using a TreeWalker to retrieve non-Javascript text nodes - Stack Overflow

This question shows how to get all text nodes inside the document, but that approach also returns the JavaScript text. What is the best way to filter out the nodes that contain JavaScript code?

Asked May 12, 2016 at 5:35 by lvella

2 Answers

Text nodes inside <script> tags have one thing in common: their parent is a <script> element. You can check for that while iterating:

if (node.parentNode.nodeName !== 'SCRIPT')
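
The parent check above can be wrapped in a small reusable predicate. This is only a sketch: the name isNonScriptText is hypothetical, and the null guard and case normalization are assumptions added for robustness, not part of the original answer.

```javascript
// Hypothetical helper: reports whether a text node is ordinary page text,
// i.e. not the direct child of a <script> element.
function isNonScriptText(node) {
  var parent = node.parentNode;
  // A detached text node has no parent; treat it as ordinary text.
  if (!parent) {
    return true;
  }
  // nodeName is upper-case for HTML elements but keeps its original case
  // in XML documents and for SVG <script> elements, so normalize it.
  return String(parent.nodeName).toUpperCase() !== 'SCRIPT';
}
```

Inside a loop over the walker this would read: if (isNonScriptText(node)) { textNodes.push(node.nodeValue); }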

Another approach is to pass a filter object to createTreeWalker, so that the walker never returns script text in the first place:

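var rejectScriptTextFilter = {
  acceptNode: function(node) {
    // Accept text whose parent is anything other than a <script> element.
    if (node.parentNode.nodeName !== 'SCRIPT') {
      return NodeFilter.FILTER_ACCEPT;
    }
    // Return an explicit constant for script text instead of undefined.
    return NodeFilter.FILTER_REJECT;
  }
};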

var walker = document.createTreeWalker(
  document.body, 
  NodeFilter.SHOW_TEXT, 
  rejectScriptTextFilter,
  false
);

var node;
var textNodes = [];

while(node = walker.nextNode()) {
  textNodes.push(node.nodeValue);
}

console.log(textNodes);

HTML for the snippet above:

<script> var str = "script here"; </script>
<p> text here </p>

You could clone the original document, remove the <script> elements from the clone, then iterate over the remaining nodes of the cloned document.
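
A minimal sketch of that clone-and-strip idea follows. The function name getTextWithoutScripts is hypothetical, and the plain recursive walk over childNodes is an assumption made for brevity; a TreeWalker created on the clone would work as well.

```javascript
// Hypothetical helper: collects text from a deep copy of `root` after
// removing every <script> element from the copy. The live page is untouched.
function getTextWithoutScripts(root) {
  var clone = root.cloneNode(true); // deep clone of the element or document

  // querySelectorAll returns a static list, so removing while looping is safe.
  var scripts = clone.querySelectorAll('script');
  for (var i = 0; i < scripts.length; i++) {
    scripts[i].parentNode.removeChild(scripts[i]);
  }

  var texts = [];
  (function visit(node) {
    if (node.nodeType === 3) { // 3 is Node.TEXT_NODE
      texts.push(node.nodeValue);
    }
    for (var j = 0; j < node.childNodes.length; j++) {
      visit(node.childNodes[j]);
    }
  })(clone);
  return texts;
}
```

In a browser this could be called as getTextWithoutScripts(document.body). Compared with the filter approach it costs a full copy of the tree, so it mainly makes sense when you want to modify the copy anyway.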
