Embedding all the external resources of an HTML page into a single file using javascript in the browser - Stack Overflow

As you all know, external resources, like images, can be embedded into the html file using base64 encoding:

<img src="..." />

I'm looking for a pure browser-based JavaScript way to traverse an HTML page and embed all of its external resources into the file, so that $("html").html() returns the page's entire contents, external resources included.

Just so it makes sense, I'm trying to download web pages into single files using a headless browser on my server.

asked Oct 27, 2014 at 19:32 by Mehran, edited Oct 28, 2014 at 10:31
  • If you're using JS, why encode the images? – Mooseman Commented Oct 27, 2014 at 19:34
  • Because JS can easily traverse all the html elements. Otherwise I'll need a parser to read and turn the tags into DOM objects before I can query them for external resources. – Mehran Commented Oct 27, 2014 at 19:37

2 Answers

There are tools out there to do that. Examples:

  • https://github.com/remy/inliner
  • https://github.com/jgallen23/grunt-inline-css
  • https://github.com/ceee/grunt-datauri

While this approach has benefits, keep in mind that inlining gives up client-side (browser) caching: a page visited more than once, or a site whose pages share the same JS/CSS files, would otherwise have those files served from the browser cache.
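For the pure in-browser route the question asks about, a minimal sketch might look like the following. It only handles `<img>` elements (stylesheets, scripts, and fonts would need similar treatment), and it assumes every image can be fetched from the page's origin or is served with CORS headers. The function names `toDataUri` and `inlineImages` are made up for illustration and do not come from any of the tools listed above.

```javascript
// Build a base64 data URI from raw bytes (a Uint8Array) and a MIME type.
function toDataUri(bytes, mimeType) {
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one char per byte, as btoa expects
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Replace the src of every <img> in `doc` with an inlined data URI.
// Assumption: each image is same-origin or CORS-enabled, otherwise
// fetch() rejects and the original src is left untouched.
async function inlineImages(doc) {
  const images = Array.from(doc.querySelectorAll("img[src]"));
  await Promise.all(
    images.map(async (img) => {
      if (img.src.startsWith("data:")) return; // already inlined
      try {
        const response = await fetch(img.src);
        if (!response.ok) return;
        const header = response.headers.get("Content-Type");
        // Drop any parameters such as "; charset=..." from the header.
        const mimeType = header ? header.split(";")[0].trim() : "application/octet-stream";
        const bytes = new Uint8Array(await response.arrayBuffer());
        img.setAttribute("src", toDataUri(bytes, mimeType));
      } catch (err) {
        // Network or CORS failure: keep the external reference as-is.
      }
    })
  );
}
```

In a headless browser, this could be run as `await inlineImages(document)` before reading `document.documentElement.outerHTML` (or `$("html").html()`).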

Browser extensions

The Save Page WE extension is available for Firefox and Chrome:

  • Firefox: https://addons.mozilla.org/en-US/firefox/addon/save-page-we/
  • Chrome: https://chrome.google.com/webstore/detail/save-page-we/dhhpefjklgkmgeafimnjhojgjamoafof/related

The extension can scroll through or zoom out of the page so that lazy-loaded resources are fetched before saving.

Command line tools

monolith (rust)

CLI tool for saving complete web pages as a single HTML file

Install

# any platform with rustc installed
cargo install monolith

# on macos
brew install monolith

# on windows
choco install monolith

obelisk (golang)

Go package and CLI tool for saving a web page as a single HTML file

# any platform with go sdk installed
go install -v github.com/go-shiori/obelisk/cmd/obelisk@latest

binaries: https://github.com/go-shiori/obelisk/releases

inliner

inliner is an npm module that exposes the inliner CLI utility. It works with some URLs but throws errors with others. It writes its output to stdout, so redirect it to a file, e.g. inliner https://http.cat > cats.html.

Assuming you have Node.js and npm, it can be installed with:

npm install -g inliner
