Is it possible to download the entire HTML
of a webpage using JavaScript
given the URL? What I want to do is develop a Firefox add-on that downloads the content of all the links found in the source of the browser's current page.
Update: the URLs reside in the same domain.
Comment: Is it similar to DownThemAll? – leopic
4 Answers
It should be possible to do using jQuery Ajax. JavaScript in a Firefox extension is not subject to the cross-origin restriction. Here are some tips for using jQuery in a Firefox extension:
Add the jQuery library to your extension's chrome/content/ directory.
Load jQuery in the window load event callback rather than including it in your browser overlay XUL; otherwise it can cause conflicts (e.g. clobbering a user's customized toolbar):
(function (loader) { loader.loadSubScript("chrome://ryebox/content/jquery-1.6.2.min.js"); })(Components.classes["@mozilla.org/moz/jssubscript-loader;1"].getService(Components.interfaces.mozIJSSubScriptLoader));
Use "jQuery" instead of "$". I experienced weird behavior when using $ instead of jQuery (a conflict of some kind, I suppose).
Use jQuery(content.document) instead of jQuery(document) to access the page's DOM. In a Firefox extension, "document" refers to the browser's XUL, whereas "content.document" refers to the page's DOM.
I wrote a Firefox extension for getting bookmarks from my friend's bookmark site. It uses jQuery to fetch my bookmarks in a JSON response from his service, then creates a menu of those bookmarks so that I can easily access them. You can browse the source at https://github.com/erturne/ryebox
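For example, here is a minimal, untested sketch in the spirit of the tips above, showing how an overlay script might fetch the HTML of every link on the current page once jQuery has been loaded (downloadAllLinks is a made-up name, not part of any API):

// Sketch only: assumes jQuery was loaded via the subscript loader as shown above.
function downloadAllLinks() {
    // content.document is the DOM of the page in the active tab
    jQuery("a[href]", content.document).each(function () {
        var url = this.href;
        jQuery.ajax({
            url: url,
            dataType: "text", // we want the raw HTML, not a parsed document
            success: function (html) {
                // Do something with the downloaded HTML here
                dump("Fetched " + url + " (" + html.length + " bytes)\n");
            },
            error: function () {
                dump("Failed to fetch " + url + "\n");
            }
        });
    });
}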
You can do XmlHttpRequests (XHRs) if the combination scheme://domain:port is the same for the page hosting the JavaScript that should fetch the HTML.
Many JS frameworks give you easy XHR support: jQuery, Dojo, etc. Example using Dojo:
function getText() {
    dojo.xhrGet({
        url: "test/someHtml.html",
        load: function (response, ioArgs) {
            // The response is the HTML
            return response;
        },
        error: function (response, ioArgs) {
            return response;
        },
        handleAs: "text"
    });
}
If you prefer writing your own XMLHttpRequest handler, take a look here: http://www.w3schools.com/xml/xml_http.asp
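For instance, a bare-bones handler might look like this (just a sketch; fetchHtml is a made-up name):

// Minimal hand-rolled XHR sketch; fetchHtml and the URL are placeholders.
function fetchHtml(url, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", url, true); // true = asynchronous
    xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
            callback(xhr.responseText); // the raw HTML of the page
        }
    };
    xhr.send();
}

// Usage: fetchHtml("test/someHtml.html", function (html) { alert(html); });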
For JavaScript in general, the short answer is no, not unless all pages are within the same domain. JavaScript is limited by the same-origin policy, so for security reasons, you cannot do cross-domain requests like that.
However, as pointed out by Max and erturne in the comments, when JavaScript is written as part of an extension/add-on to the browser, the regular rules about same-origin policy and cross-domain requests do not seem to apply - at least not for Firefox and Chrome. Therefore, using JavaScript to download the pages should be possible using an XMLHttpRequest, or using some of the wrapper methods included in your favorite JS library.
If you, like me, prefer jQuery, you can have a look at jQuery's .load() method, which loads HTML from a given resource and injects it into an element you specify.
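For example (a sketch only; the #result selector and the URL are placeholders of my own):

// Load the HTML of a same-domain page into a placeholder element.
jQuery("#result").load("test/someHtml.html", function (response, status) {
    if (status === "error") {
        alert("Could not load the page");
    }
});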
Edit: Made some updates to my answer based on the comments about cross-domain requests made by add-ons.
If you only need a simple text/HTML page downloader, and you only know HTML and JavaScript, you can write a downloader named "download.hta" (an HTML Application) that uses HTML and JavaScript to drive Msxml2.ServerXMLHTTP.6.0 and the FileSystemObject (FSO).
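As a rough, untested sketch (the URL and output path below are placeholders of my own), such an HTA could look like this:

<!-- download.hta : minimal sketch of an HTML Application downloader -->
<html>
<head>
<title>Page downloader</title>
<script language="JScript">
function downloadPage() {
    // Msxml2.ServerXMLHTTP.6.0 performs the HTTP request
    var http = new ActiveXObject("Msxml2.ServerXMLHTTP.6.0");
    http.open("GET", "http://example.com/", false); // synchronous for simplicity
    http.send();

    // FSO (Scripting.FileSystemObject) writes the HTML to disk
    var fso = new ActiveXObject("Scripting.FileSystemObject");
    var file = fso.CreateTextFile("C:\\temp\\page.html", true);
    file.Write(http.responseText);
    file.Close();
}
</script>
</head>
<body>
<button onclick="downloadPage()">Download</button>
</body>
</html>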