Is it possible to download the entire HTML
of a webpage using JavaScript
given the URL? What I want to do is develop a Firefox add-on that downloads the content of all the links found in the source of the browser's current page.
Update: the URLs reside in the same domain.
Comment: Is it similar to DownThemAll? – leopic
4 Answers
It should be possible to do using jQuery Ajax. JavaScript in a Firefox extension is not subject to the cross-origin restriction. Here are some tips for using jQuery in a Firefox extension:
Add the jQuery library to your extension's chrome/content/ directory.
Load jQuery in the window load event callback rather than including it in your browser overlay XUL; otherwise it can cause conflicts (e.g. clobbering a user's customized toolbar):
(function (loader) { loader.loadSubScript("chrome://ryebox/content/jquery-1.6.2.min.js"); })(Components.classes["@mozilla.org/moz/jssubscript-loader;1"].getService(Components.interfaces.mozIJSSubScriptLoader));
Use "jQuery" instead of "$". I experienced weird behavior when using $ instead of jQuery (a conflict of some kind, I suppose).
Use jQuery(content.document) instead of jQuery(document) to access the page's DOM. In a Firefox extension, "document" refers to the browser's XUL, whereas "content.document" refers to the page's DOM.
I wrote a Firefox extension for getting bookmarks from my friend's bookmark site. It uses jQuery to fetch my bookmarks in a JSON response from his service, then creates a menu of those bookmarks so that I can easily access them. You can browse the source at https://github.com/erturne/ryebox
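For example, here is a minimal, untested sketch in the spirit of the tips above, showing how an overlay script might fetch the HTML of every link on the current page once jQuery has been loaded (downloadAllLinks is a made-up name, not part of any API):

// Sketch only: assumes jQuery was loaded via the subscript loader as shown above.
function downloadAllLinks() {
    // content.document is the DOM of the page in the active tab
    jQuery("a[href]", content.document).each(function () {
        var url = this.href;
        jQuery.ajax({
            url: url,
            dataType: "text", // we want the raw HTML, not a parsed document
            success: function (html) {
                // Do something with the downloaded HTML here
                dump("Fetched " + url + " (" + html.length + " bytes)\n");
            },
            error: function () {
                dump("Failed to fetch " + url + "\n");
            }
        });
    });
}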
You can do XmlHttpRequests (XHRs) if the combination scheme://domain:port is the same for the page hosting the JavaScript that should fetch the HTML.
Many JS frameworks give you easy XHR support: jQuery, Dojo, etc. Example using Dojo:
function getText() {
    dojo.xhrGet({
        url: "test/someHtml.html",
        load: function (response, ioArgs) {
            // The response is the HTML
            return response;
        },
        error: function (response, ioArgs) {
            return response;
        },
        handleAs: "text"
    });
}
If you prefer writing your own XMLHttpRequest handler, take a look here: http://www.w3schools.com/xml/xml_http.asp
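For instance, a bare-bones handler might look like this (just a sketch; fetchHtml is a made-up name):

// Minimal hand-rolled XHR sketch; fetchHtml and the URL are placeholders.
function fetchHtml(url, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", url, true); // true = asynchronous
    xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
            callback(xhr.responseText); // the raw HTML of the page
        }
    };
    xhr.send();
}

// Usage: fetchHtml("test/someHtml.html", function (html) { alert(html); });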
For JavaScript in general, the short answer is no, not unless all pages are within the same domain. JavaScript is limited by the same-origin policy, so for security reasons, you cannot do cross-domain requests like that.
However, as pointed out by Max and erturne in the comments, when JavaScript is written as part of an extension/add-on to the browser, the regular rules about same-origin policy and cross-domain requests do not seem to apply - at least not for Firefox and Chrome. Therefore, using JavaScript to download the pages should be possible using an XMLHttpRequest, or using some of the wrapper methods included in your favorite JS library.
If you, like me, prefer jQuery, you can have a look at jQuery's .load() method, which loads HTML from a given resource and injects it into an element you specify.
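For example (a sketch only; the #result selector and the URL are placeholders of my own):

// Load the HTML of a same-domain page into a placeholder element.
jQuery("#result").load("test/someHtml.html", function (response, status) {
    if (status === "error") {
        alert("Could not load the page");
    }
});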
Edit: Made some updates to my answer based on the comments about cross-domain requests made by add-ons.
If you only need a simple text/HTML page downloader, and you only know HTML and JavaScript, you can write a downloader named "download.hta" (an HTML Application) that uses HTML and JavaScript to drive Msxml2.ServerXMLHTTP.6.0 and the FileSystemObject (FSO).
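As a rough, untested sketch (the URL and output path below are placeholders of my own), such an HTA could look like this:

<!-- download.hta : minimal sketch of an HTML Application downloader -->
<html>
<head>
<title>Page downloader</title>
<script language="JScript">
function downloadPage() {
    // Msxml2.ServerXMLHTTP.6.0 performs the HTTP request
    var http = new ActiveXObject("Msxml2.ServerXMLHTTP.6.0");
    http.open("GET", "http://example.com/", false); // synchronous for simplicity
    http.send();

    // FSO (Scripting.FileSystemObject) writes the HTML to disk
    var fso = new ActiveXObject("Scripting.FileSystemObject");
    var file = fso.CreateTextFile("C:\\temp\\page.html", true);
    file.Write(http.responseText);
    file.Close();
}
</script>
</head>
<body>
<button onclick="downloadPage()">Download</button>
</body>
</html>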