
javascript - How to parse an html document into an AST that includes line numbers for each node? - Stack Overflow


I'd like to use JavaScript to parse an HTML document into an abstract syntax tree in which each node also carries its start and end line numbers (and hopefully character positions too). Are there any existing solutions that can do this? I don't want to have to write it myself.

Edit Apr 24, 2016: Being able to parse HTML with PHP tags in arbitrary places would be even better.

Asked Oct 13, 2014 at 21:23 by EricP, edited Apr 27, 2016 at 2:28
  • Did you find something? – Ivan Bacher Commented Mar 16, 2015 at 17:23
  • Nope. For now I'm just using some regexes that handle most cases until I have time to return and write a real parser myself. – EricP Commented Mar 16, 2015 at 21:05
  • Oh, OK. I found this: github.com/HenrikJoreteg/html-parse-stringify. It's missing line numbers, though, so I might try to add that if I have time. – Ivan Bacher Commented Mar 18, 2015 at 10:15

2 Answers


https://unifiedjs.github.io/ can get you the CST or AST for a few formats including HTML.

I used node-html-parser, and it works like a charm. Character positions are easy to get from each node's 'range' attribute:

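import { parse } from 'node-html-parser';
// code is the HTML source string; range holds the node's start and end character offsets in it
const scripts = parse(code).getElementsByTagName('script');
const pureCode = code.slice(scripts[0].range[0], scripts[0].range[1]);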
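
The question asks for line numbers, while 'range' gives character offsets, so the offsets still need to be mapped to lines. Below is a minimal sketch of one way to do that. It is not part of node-html-parser: buildLineIndex and offsetToPosition are hypothetical helper names, the sample range is hand-written to stand in for a node's range, and the code assumes "\n" line endings and an end offset that points just past the node.

```javascript
// Hypothetical helpers (not part of node-html-parser): map character
// offsets, such as the two numbers in a node's range, to 1-based
// line and column numbers.

// Record the offset at which each line starts. Assumes "\n" line endings.
function buildLineIndex(source) {
  const lineStarts = [0];
  for (let i = 0; i < source.length; i++) {
    if (source[i] === '\n') lineStarts.push(i + 1);
  }
  return lineStarts;
}

// Binary search for the last line whose start offset is <= offset.
function offsetToPosition(lineStarts, offset) {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (lineStarts[mid] <= offset) lo = mid;
    else hi = mid - 1;
  }
  return { line: lo + 1, column: offset - lineStarts[lo] + 1 };
}

// Example: a hand-written range standing in for a script node's range.
const source = '<div>\n  <script>let x = 1;</script>\n</div>\n';
const range = [
  source.indexOf('<script>'),
  source.indexOf('</script>') + '</script>'.length,
];
const lineStarts = buildLineIndex(source);
const start = offsetToPosition(lineStarts, range[0]);
const end = offsetToPosition(lineStarts, range[1]);
console.log(start, end); // { line: 2, column: 3 } { line: 2, column: 30 }
```

Building the line index once and binary-searching it keeps each lookup cheap when many nodes need positions. The end position here is the column just past the closing tag, matching the half-open range assumed above.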