Detect all images with Javascript in an html page - Stack Overflow

I am writing a chrome extension and I am trying to detect all images in a webpage.

I am trying in my JS code to detect all images on a webpage, and by all I mean:

  1. Images that are loaded once the webpage is loaded
  2. Images that are used as background (either in the CSS or inline html)
  3. Images that could be loaded after the webpage is done loading. For instance, in a Google image search it is easy to find all the initial images, but once you click on one image to enlarge it, that image is not detected. The same happens when browsing social media websites.

The code that I have right now makes it easy to find the initial images (1). But I struggle with the other two parts (2) and (3).

Here is my current code in contentScript.js:

var images = document.getElementsByTagName('img');
for (var i = 0, l = images.length; i < l; i++) {
    //Do something
}

How should I modify it so that it can actually detect all the other images (2 and 3)?

I have seen a couple of questions on (2) on SO like this one or this one, but none of the answers seem to completely satisfy my second requirement and none of them is about the third.

Asked Sep 21, 2018 at 13:26 by LBes; edited Oct 11, 2018 at 8:18 by LBes
  • Regarding Point 3. Couldn't you just do a setInterval() and check if any new images are in the DOM? – filip Commented Sep 21, 2018 at 13:31
  • @filip seems quite computationally heavy (especially if you want new images to be detected right away, which is one requirement I have). I was thinking more of something like catching events. Isn't there any event that I could use to know that something has been added to the DOM and just check what that something contains to see if there is an image? – LBes Commented Sep 21, 2018 at 13:33
  • Found something called MutationObserver, which checks for changes in the DOM (for example adding an <img> tag): developer.mozilla.org/en-US/docs/Web/API/MutationObserver @LBes – filip Commented Sep 21, 2018 at 13:38
  • Interesting @filip, will give this a try when I'm back from work – LBes Commented Sep 21, 2018 at 13:41
3 Answers

Live collection of imgs

Finding all HTML images, as @vsync has said, is as simple as var images = document.images. This is a live collection, so any images that are dynamically added to or removed from the page are automatically reflected in it.

Extracting background images (inline and CSS)

There are a few ways to check for background images, but perhaps the most reliable way is to iterate over all the page's elements and use window.getComputedStyle to check if each element's backgroundImage does not equal none. This will get background images set both inline and in CSS.

var images = [];
var elements = document.body.getElementsByTagName("*");
Array.prototype.forEach.call( elements, function ( el ) {
    var style = window.getComputedStyle( el );
    // skip elements that have no background image
    if ( style.backgroundImage !== "none" ) {
        // strip the url( ... ) wrapper and any surrounding quotes
        images.push( style.backgroundImage.slice( 4, -1 ).replace(/['"]/g, "") );
    }
} );

Getting the background image from window.getComputedStyle will return the full CSS background-image property, in the form url(...), so you will need to remove the url( and ). You'll also need to remove any " or ' surrounding the URL. You might accomplish this using backgroundImage.slice( 4, -1 ).replace(/['"]/g, ""). Note that this assumes the value is a single url(...); gradients and multiple comma-separated backgrounds need extra handling.
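The slice-based approach only works when the computed value is exactly one url(...). As a rough sketch of a more tolerant alternative, the function below pulls every url(...) out of a background-image string with a regular expression. The name extractBackgroundUrls is hypothetical and not part of the original answer, and the pattern is an assumption that covers the common quoted and unquoted forms rather than the full CSS grammar.

```javascript
// Hypothetical helper (not from the answer): return every URL found in a
// computed background-image value, ignoring gradients and "none".
function extractBackgroundUrls(backgroundImage) {
    var urls = [];
    // Matches url("..."), url('...') and unquoted url(...)
    var pattern = /url\(\s*(?:"([^"]*)"|'([^']*)'|([^)]*?))\s*\)/g;
    var match;
    while ((match = pattern.exec(backgroundImage)) !== null) {
        // Exactly one of the three capture groups is defined per match
        var url = match[1] !== undefined ? match[1]
                : match[2] !== undefined ? match[2]
                : match[3];
        urls.push(url);
    }
    return urls;
}
```

In the loop above you could then call extractBackgroundUrls( style.backgroundImage ) and push each returned URL, instead of slicing the string.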

Only start checking once the DOM is ready, otherwise your initial scan might miss elements.

Dynamically added background images

Unlike document.images, the scan above does not give you a live list, so you will need a MutationObserver to watch the document, and check any changed elements for the presence of backgroundImage.

When configuring your observer, make sure your MutationObserver config has childList and subtree set to true. childList reports added and removed nodes, and subtree extends that to all descendants of the observed element (in your case the body).

var body = document.body;
var callback = function( mutationsList, observer ){
    for( var mutation of mutationsList ) {
        if ( mutation.type == 'childList' ) {
            // newly inserted nodes are in mutation.addedNodes
            // (mutation.target is the parent whose child list changed),
            // so iterate over them as in the code sample above
        }
    }
}
var observer = new MutationObserver( callback );
var config = { characterData: true,
            attributes: false,
            childList: true,
            subtree: true };
observer.observe( body, config );
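One detail worth keeping in mind: when a whole subtree is inserted at once, the observer reports only the root of that subtree in addedNodes, not each descendant. The sketch below shows one way to flatten a batch of mutation records into the full list of inserted elements. The function name collectAddedElements is hypothetical, and the code assumes each added element supports querySelectorAll, as DOM elements do.

```javascript
// Hypothetical helper: list every element inserted by a batch of mutation
// records, including the descendants of each inserted subtree.
function collectAddedElements(mutationsList) {
    var found = [];
    Array.prototype.forEach.call(mutationsList, function (mutation) {
        if (mutation.type !== "childList") return;
        Array.prototype.forEach.call(mutation.addedNodes, function (node) {
            if (node.nodeType !== 1) return; // 1 is Node.ELEMENT_NODE; skips text and comment nodes
            found.push(node);
            // Only the subtree root is reported, so walk its descendants explicitly
            Array.prototype.forEach.call(node.querySelectorAll("*"), function (descendant) {
                found.push(descendant);
            });
        });
    });
    return found;
}
```

Inside the observer callback you could call collectAddedElements( mutationsList ) and run the same background-image check on each returned element.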

Since searching for background images requires you to check every element in the DOM, you might as well check for <img>s at the same time, rather than using document.images.

Code

You would want to modify the code above so that, in addition to checking if it has a background image, you would check if its tag name is IMG. You should also put it in a function that runs when the DOM is ready.

UPDATE: To differentiate between images and background images, you could push them to different arrays, for example to images and bg_images. To also identify the parents of images, you would push the image.parentNode to a third array, e.g. image_parents.

var images = [],
    bg_images = [],
    image_parents = [];
document.addEventListener('DOMContentLoaded', function () {
    var body = document.body;
    var elements = document.body.getElementsByTagName("*");

    /* When the DOM is ready find all the images and background images
        initially loaded */
    Array.prototype.forEach.call( elements, function ( el ) {
        var style = window.getComputedStyle( el );
        if ( el.tagName === "IMG" ) {
            images.push( el.src ); // save image src
            image_parents.push( el.parentNode ); // save image parent

        } else if ( style.backgroundImage != "none" ) {
            bg_images.push( style.backgroundImage.slice( 4, -1 ).replace(/['"]/g, "") ); // save background image url
        }
    } );

    /* MutationObserver callback to add images when the body changes */
    var callback = function( mutationsList, observer ){
        for( var mutation of mutationsList ) {
            if ( mutation.type == 'childList' ) {
                // use only the newly inserted nodes, so existing children are not recorded twice
                Array.prototype.forEach.call( mutation.addedNodes, function ( child ) {
                    if ( child.nodeType !== Node.ELEMENT_NODE ) return; // skip text and comment nodes
                    var style = window.getComputedStyle( child );
                    if ( child.tagName === "IMG" ) {
                        images.push( child.src ); // save image src
                        image_parents.push( child.parentNode ); // save image parent
                    } else if ( style.backgroundImage != "none" ) {
                        bg_images.push( style.backgroundImage.slice( 4, -1 ).replace(/['"]/g, "") ); // save background image url
                    }
                } );
            }
        }
    };
    var observer = new MutationObserver( callback );
    var config = { characterData: true,
                attributes: false,
                childList: true,
                subtree: true };

    observer.observe( body, config );
});

For HTML images (which already exist by the time you run this):

document.images

For CSS images:

You would probably need to run a regex over the page's CSS (inline styles and external files), but this is tricky: relative paths have to be resolved into full URLs against the location of each stylesheet, and that might not always work.

Getting all css used in html file
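To make the regex idea concrete, here is a minimal sketch that scans a string of CSS for url(...) references and resolves each one against the address of the stylesheet it came from, using the standard URL constructor. The function name extractCssImageUrls and both of its parameters are hypothetical. Note that it returns every url(...) it finds, so fonts and other non-image resources would need to be filtered out separately.

```javascript
// Hypothetical helper: find url(...) references in raw CSS text and resolve
// them against the URL of the stylesheet that contains them.
function extractCssImageUrls(cssText, stylesheetUrl) {
    var urls = [];
    // Matches url("..."), url('...') and unquoted url(...)
    var pattern = /url\(\s*(?:"([^"]*)"|'([^']*)'|([^)]*?))\s*\)/g;
    var match;
    while ((match = pattern.exec(cssText)) !== null) {
        var raw = match[1] !== undefined ? match[1]
                : match[2] !== undefined ? match[2]
                : match[3];
        if (raw === "") continue; // ignore empty url()
        // new URL(relative, base) applies the standard URL resolution rules
        urls.push(new URL(raw, stylesheetUrl).href);
    }
    return urls;
}
```

For an inline style block, the page's own address (for example location.href) would be the base instead of a stylesheet URL.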


For delayed-loaded images:

You can use a mutation observer, as @filip suggested in his answer.

This should solve your third problem. I used a MutationObserver.

I check the targetNode for changes and add a callback, if a change happens.

For your case the targetNode should be the root element to check changes in the whole document.

In the callback I check whether the mutation has added a node with the "IMG" tag.

    const targetNode = document.getElementById("root");

    // Options for the observer (which mutations to observe)
    let config = { attributes: true, childList: true, subtree: true };

    // Callback function to execute when mutations are observed
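    const callback = function(mutationsList, observer) {
        for (let mutation of mutationsList) {
            // addedNodes is empty for attribute mutations, so loop over it
            // instead of reading addedNodes[0] directly
            for (let node of mutation.addedNodes) {
                if (node.tagName === "IMG") {
                    console.log("New Image added in DOM!");
                }
            }
        }
    };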

    // Create an observer instance linked to the callback function
    const observer = new MutationObserver(callback);

    // Start observing the target node for configured mutations
    observer.observe(targetNode, config);
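
Some pages swap the src of an existing <img> rather than inserting a new element, which a check on added nodes alone would miss. The sketch below handles both cases. The function name imageUrlsFromMutations is hypothetical, and whether a given site (such as the image search mentioned in the question) actually works this way is an assumption you would need to verify.

```javascript
// Hypothetical helper: collect image URLs from inserted <img> nodes and from
// changes to the src attribute of existing <img> elements.
function imageUrlsFromMutations(mutationsList) {
    var urls = [];
    for (var mutation of mutationsList) {
        if (mutation.type === "childList") {
            for (var node of mutation.addedNodes) {
                // Text nodes have no tagName, so they are skipped here
                if (node.tagName === "IMG") urls.push(node.src);
            }
        } else if (mutation.type === "attributes" &&
                   mutation.attributeName === "src" &&
                   mutation.target.tagName === "IMG") {
            urls.push(mutation.target.src);
        }
    }
    return urls;
}

// Assumed observer options: attributeFilter limits attribute records to src
var srcAwareConfig = { childList: true, subtree: true, attributes: true, attributeFilter: ["src"] };
```

You would pass srcAwareConfig to observer.observe() and call imageUrlsFromMutations( mutationsList ) from the callback.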