Use browser to run custom JavaScript on page (client side) to simulate clicking? How to do? - Stack Overflow

I want to automatically grab some content from a page.

I wonder if it is possible:

  1. Run my own JavaScript on the page after the page has loaded. (I use Firefox. I cannot change the content of the page; I just want to run JS in my browser.) The script will use getElementById or a similar method to get the link to the next page.

  2. Run JavaScript to collect the content I am interested in (some URLs) on that page and store those URLs in a local file.

  3. Go to the next page (the next page should actually load in my browser, but without any intervention from me) and repeat steps 1 and 2 until there is no next page.

The classic way to do this is to write a Perl script using LWP, a PHP script using cURL, and so on. But that all runs outside the browser. I wonder if I can do it client side.

asked Aug 28, 2012 at 22:28 by Minghui Yu; edited May 16, 2013 at 14:25 by schesis
  • There isn't a way to write directly to a file on the client side. It is a security risk, so you need to use Ajax or a page submit to write files. If you don't own the pages, and if you don't mind re-running the JavaScript on each page manually (i.e. running your script through Firebug), you could do all this, but I think it would be time consuming. This sounds like web crawling, and I'm pretty certain you would be better off doing it on the server side. – scrappedcola Commented Aug 28, 2012 at 22:36
  • Hi @scrappedcola: I may not have expressed my intent well. I want to run my JavaScript through something like Firebug (but Firebug cannot do that; it can only debug JS that comes with the page). If it cannot write to a local file, then writing to the console is also okay. I can copy and paste; that is not exactly what I need, but I can live with it. Thanks. – Minghui Yu Commented Aug 28, 2012 at 22:41
  • You can run JS through Firebug. Open Firebug and click on the Console tab. At the bottom there is a command line (it has a red box next to >>> and a red box on the right-hand side). Click the red box on the right-hand side (it has a white arrow on it). In the box that opens, paste your JS code and click Run at the bottom of that box. See: getfirebug.com/commandline – scrappedcola Commented Aug 28, 2012 at 22:43
  • You don't even need to paste the entire code. You could write a short one-liner that adds your JS file to the page and runs it, as long as you have the file hosted somewhere (IIS on localhost would probably work, using your machine's fully qualified name). – scrappedcola Commented Aug 28, 2012 at 22:45

2 Answers

I do something rather similar, actually.

By using Greasemonkey, you can write a user script that interacts with the pages however you need. You can get the next-page link and scroll things as you like.
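As a rough illustration, a user script along these lines could look like the sketch below. It assumes the "next" link can be recognised by its visible text; the script name, the @include pattern, and the helper findNextUrl are placeholders invented for this example, not part of any real site or of the original script.

```javascript
// ==UserScript==
// @name     Link collector (hypothetical name)
// @include  https://example.com/*
// @grant    none
// ==/UserScript==

// Return the href of the first link whose text starts with "next",
// or null when no such link exists (assumed to mean the last page).
// The text test is an assumption; on a real site you might instead
// use doc.getElementById with a known id.
function findNextUrl(doc) {
  for (var i = 0; i < doc.links.length; i++) {
    var link = doc.links[i];
    if (/^\s*next\b/i.test(link.textContent || "")) {
      return link.href;
    }
  }
  return null;
}

// Only runs inside a browser, where `document` exists.
if (typeof document !== "undefined") {
  // Collect whatever you need from this page here, before navigating.
  var nextUrl = findNextUrl(document);
  if (nextUrl) {
    // Navigating loads the next page, where the user script runs again.
    window.location.href = nextUrl;
  }
}
```

Because the user script is re-run on every page that matches the include pattern, no manual step is needed between pages.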

You can also store data locally, within Firefox, through the Greasemonkey functions GM_getValue and GM_setValue.
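A minimal sketch of accumulating URLs across pages with these two functions follows. It assumes the classic synchronous API (newer Greasemonkey releases use the promise-based GM.getValue and GM.setValue instead); the key name "savedUrls" and the helper appendUrls are made up for this example.

```javascript
// Append newly found URLs to the list kept in the script's storage.
// getValue/setValue are passed in so the function can be tried outside
// the browser; in a user script they would be GM_getValue and GM_setValue.
function appendUrls(getValue, setValue, newUrls) {
  // The list is stored as a JSON string because the classic API only
  // accepts strings, numbers, and booleans.
  var stored = JSON.parse(getValue("savedUrls", "[]"));
  for (var i = 0; i < newUrls.length; i++) {
    if (stored.indexOf(newUrls[i]) === -1) { // skip duplicates
      stored.push(newUrls[i]);
    }
  }
  setValue("savedUrls", JSON.stringify(stored));
  return stored;
}

// Inside a user script whose metadata block declares
//   // @grant GM_getValue
//   // @grant GM_setValue
// the call would look like this:
if (typeof GM_getValue === "function" && typeof GM_setValue === "function") {
  var pageUrls = [];
  for (var j = 0; j < document.links.length; j++) {
    pageUrls.push(document.links[j].href);
  }
  appendUrls(GM_getValue, GM_setValue, pageUrls);
}
```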

I take the lazy way out. I just generate a long list of the URLs that I find when navigating the pages. I use a crude document.write approach to dump my list of URLs as a batch file that relies on wget.

At that point I copy and paste the batch file, then run it.
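Generating the batch file text might look roughly like this. The helper name buildWgetBatch is invented for this sketch, and the quoting assumes a Windows batch file, where a literal % has to be doubled.

```javascript
// Build the text of a batch file with one wget command per URL.
// Assumes Windows cmd: URLs are wrapped in double quotes, and % is
// doubled because a single % is treated specially in batch files.
function buildWgetBatch(urls) {
  var lines = [];
  for (var i = 0; i < urls.length; i++) {
    lines.push('wget "' + urls[i].replace(/%/g, "%%") + '"');
  }
  return lines.join("\r\n");
}

// In the browser, the result could then be dumped for copy and paste:
//   document.write("<pre>" + buildWgetBatch(myUrls) + "</pre>");
// where myUrls is whatever array of URL strings the script collected.
```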

If you need to run this often enough that it should be automated, there used to be a way to turn Greasemonkey scripts into Firefox extensions, which have access to more power.

Another option is, AFAIK, currently Chrome only. You can collect whatever information you need and build a large file from it, then use the download attribute of a link to come up with a single-click way to save things.
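A sketch of that idea, assuming the browser supports Blob, URL.createObjectURL, and the download attribute; the function names and the one-URL-per-line file layout are choices made for this example only.

```javascript
// Turn the collected URLs into the text of the file to be saved,
// one URL per line.
function buildFileText(urls) {
  return urls.join("\n") + "\n";
}

// Browser only: add a link to the page that saves `text` as `filename`
// when clicked.
function offerDownload(text, filename) {
  var blob = new Blob([text], { type: "text/plain" });
  var a = document.createElement("a");
  a.href = URL.createObjectURL(blob); // temporary URL pointing at the blob
  a.download = filename;              // the download attribute described above
  a.textContent = "Save " + filename;
  document.body.appendChild(a);       // a single click on this link saves the file
  return a;
}

// Example call in the browser:
//   offerDownload(buildFileText(["http://example.com/a"]), "urls.txt");
```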

Update

I was going to share the full code for what I was doing, but it was so tied to a particular website that it wouldn't really have helped, so I'll go for a more "general" solution.

Warning: this code was typed on the fly and may not actually be correct.

// Define the container
// If you are crawling multiple pages, you'd want to load this from
// localStorage.
var savedLinks = [];

// Walk through the document and build the links.
for (var i = 0; i < document.links.length; i++) {
  var link = document.links[i];

  var data = {
    url: link.href,                // href holds the resolved URL of the link
    desc: link.textContent.trim()  // visible text of the link
  };

  savedLinks.push(data);
}

// Here you'd want to save your data via localStorage.


// If not on the last page, find the 'next' button and load the next page
// [load next page here]

// If we *are* on the last page, use document.write to output our list.
// 
// Note: document.write totally destroys the current document.  It really is quite
// an ugly way to do it, but in this case it works.
document.write(JSON.stringify(savedLinks, null, 2));
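The localStorage step that the comments leave open could be filled in roughly as follows. This is a sketch under assumptions: the key "savedLinks" and both helper names are invented here, and localStorage is per origin, so it only carries over between pages of the same site.

```javascript
// Load the list saved by earlier pages, or start with an empty one.
// `storage` is window.localStorage in the browser.
function loadSavedLinks(storage) {
  var raw = storage.getItem("savedLinks");
  return raw ? JSON.parse(raw) : [];
}

// Add this page's links to the stored list and write it back.
function savePageLinks(storage, doc) {
  var saved = loadSavedLinks(storage);
  for (var i = 0; i < doc.links.length; i++) {
    saved.push({
      url: doc.links[i].href,
      desc: doc.links[i].textContent
    });
  }
  // localStorage only holds strings, so the list is serialised as JSON.
  storage.setItem("savedLinks", JSON.stringify(saved));
  return saved;
}

// In the browser, each page would call:
//   savePageLinks(window.localStorage, document);
// and the last page would output loadSavedLinks(window.localStorage).
```

Remember to remove the stored key before starting a new crawl, otherwise links from the previous run are still in the list.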

Selenium/WebDriver will let you write a simple Java/Ruby/PHP app that launches Firefox and uses its JavaScript engine to interact with the page in the browser.

Or, if the web page does not require JavaScript to make the content you are interested in available, you could use an HTML parser in your favourite language and leave the browser out of it.

If you want to do it in JavaScript in Firefox, you could probably do it in a Greasemonkey script.
