
java - Server side browser that can execute JavaScript - Stack Overflow


Are there any programming libraries that will parse an HTML document, execute its JavaScript, and then let me navigate the resulting DOM? This needs to happen server side, not client side. Any language will do, but Java, PHP, or Ruby are preferred.


asked Jan 26, 2010 at 20:10 by Matt Sidesinger

9 Answers


Have you tried Bringing the Browser to the Server?

In Java: http://lobobrowser.org/cobra/java-html-parser.jsp
This is a JavaScript-aware, CSS-aware HTML parser.
The most important feature in relation to your question: it is JavaScript-aware. DOM modifications that occur during parsing are reflected in the resulting DOM.

Java supports JavaScript through Rhino. Also look at this page for server-side JavaScript solutions: http://en.wikipedia.org/wiki/Server-side_JavaScript
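To make the Rhino route concrete, here is a minimal sketch that goes through the JDK's standard `javax.script` API rather than calling Rhino directly. It assumes a script engine is registered under the name "JavaScript": Java 6 and 7 bundled a Rhino-based engine, Java 8 replaced it with Nashorn, and JDK 15 removed the bundled engine, so on a recent JVM you would need to put an engine such as Rhino on the classpath yourself. The class and method names (`ServerSideJs`, `evalOrNull`) are made up for illustration. Note that this evaluates plain JavaScript only; there is no `document` or `window`, so it does not by itself give you a DOM.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper class, named for this example only.
class ServerSideJs {

    /** Evaluates a JavaScript snippet, or returns null when no engine is registered. */
    static Object evalOrNull(String script) {
        // Which engine answers to "JavaScript" (if any) depends on the JDK version and classpath.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        if (engine == null) {
            return null;
        }
        try {
            return engine.eval(script);
        } catch (ScriptException e) {
            // Rethrow as unchecked so callers are not forced to handle it.
            throw new IllegalStateException("Script failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        Object result = evalOrNull("6 * 7");
        System.out.println(result == null
                ? "No JavaScript engine available on this JVM"
                : "6 * 7 = " + result);
    }
}
```

If the goal is to run a page's scripts against its DOM, an engine alone is not enough; that is the gap the browser-like tools mentioned in the other answers are meant to fill.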

For Java, be sure to check out HtmlUnit and HttpUnit.

PhantomJS does this and can be used with any server-side language. See the integration modules below for Node.js and PHP.

NodeJS

https://npmjs.org/package/node-phantom

https://github.com/sgentle/phantomjs-node

PHP

https://github.com/diggin/php-PhantomjsRunner

PHP has DOMDocument for navigating the DOM. I haven't heard of anything for executing JavaScript.
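For comparison, here is what DOM-only navigation looks like in Java, since that is the asker's first preference. This is a sketch of the same idea, not PHP's DOMDocument: it uses the JDK's built-in XML parser, which assumes the markup is well-formed XHTML (real-world HTML usually needs a more forgiving parser). The class and method names are hypothetical. The sample page contains a script that would change the title in a browser; because nothing executes it, the parsed DOM still holds the static title.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, named for this example only.
class DomOnlyNavigation {

    /** Parses well-formed XHTML into a DOM tree without running any scripts. */
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse markup", e);
        }
    }

    /** Returns the text of the first title element. */
    static String titleOf(String xhtml) {
        return parse(xhtml).getElementsByTagName("title").item(0).getTextContent();
    }

    /** Collects the href attribute of every anchor element. */
    static List<String> linkTargets(String xhtml) {
        NodeList anchors = parse(xhtml).getElementsByTagName("a");
        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            hrefs.add(((Element) anchors.item(i)).getAttribute("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        String page = "<html><head><title>static title</title>"
                + "<script>document.title = 'set by script';</script></head>"
                + "<body><a href=\"/a\">A</a></body></html>";
        System.out.println(titleOf(page));      // prints "static title": the script never ran
        System.out.println(linkTargets(page));  // prints "[/a]"
    }
}
```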

Start from this post and follow the links, or just search for Rhino.

There are now several projects that do a really good job of this:

  • PhantomJS is a headless version of WebKit, and there are some helpful wrappers such as CasperJS.

  • Zombie.js is a wrapper over jsdom, written in JavaScript for Node.js.

You need to write JavaScript code to interact with both of these projects. I like Zombie.js better so far, since it is easier to set up, and you can use any Node.js/npm modules in your code.

Node.js?

Node can run any JavaScript file from its console. I would try Node first and see whether it can do what you want, as it likely has the largest user base and the most documentation.
