
How can I wait for a JavaScript function to finish evaluating when scraping a website using Puppeteer/Node.js? - Stack Overflow


I have a website where I want to manipulate certain user inputs, generate a report by clicking a button, and download the resulting report by clicking another button.

I recorded the user behaviour (manipulation) with Chrome DevTools Recorder, exported the Puppeteer script, and adapted it a bit. Everything works fine until the report is generated (some JavaScript function is evaluated in the background). Is there a way to wait for this evaluation to complete before the script tries to click something that is not yet available?

Here is the code up to the point where I get an error (I redacted the URL for now):

        console.log('Navigating to the target page...');
        await page.goto('https://website.url', { waitUntil: 'networkidle2' });

        console.log('Clicking language selector...');
        await page.waitForSelector('div.active-lang', { visible: true });
        await page.click('div.active-lang');

        console.log('Switching to English...');
        await Promise.all([
            page.waitForNavigation({ waitUntil: 'networkidle2' }),
            page.click('li.en-gb > a')
        ]);

        console.log('Opening report...');
        await Promise.all([
            page.waitForNavigation({ waitUntil: 'networkidle2' }),
            page.click('#treeMenu\\:0_4_1_3 a')
        ]);

        console.log('Selecting year...');
        await page.waitForSelector('#input > table:nth-of-type(1) span', { visible: true });
        await page.click('#input > table:nth-of-type(1) span');
        await page.waitForSelector('#yearBeg_0', { visible: true });
        await page.click('#yearBeg_0');

        console.log('Generating report...');
        await page.waitForSelector('td.action_c2 span', { visible: true });
        await page.click('td.action_c2 span', { timeout: 30000, waitUntil: 'networkidle2' });

        console.log('Exporting file...');
        await page.waitForSelector('td.action_c1 span.ui-button-text', { visible: true });
        await page.click('td.action_c1 span.ui-button-text', { timeout: 60000, waitUntil: 'networkidle2' });
        await page.screenshot({ path: 'debug.png_exporting' });

        console.log('Hovering menu...');
        // Hover over the parent menu to reveal XLS option
        await page.waitForSelector('li.ui-state-hover', { visible: true, timeout: 30000, waitUntil: 'networkidle2' });
        await page.hover('li.ui-state-hover > a', { timeout: 30000, waitUntil: 'networkidle2' });  // Hover to reveal the submenu
        await page.screenshot({ path: 'debug.png_hovering' });

        
        console.log('Selecting XLS format...');
        try {
        
            // Wait for the XLS option and click it by text content
            const xlsMenuItem = await page.waitForFunction(() => {
                const menuItems = Array.from(document.querySelectorAll('li.ui-state-hover span.ui-menuitem-text'));
                return menuItems.find(item => item.textContent.trim() === 'XLS');
            }, { timeout: 30000 });
        
            if (xlsMenuItem) {
                console.log('Clicking on XLS menu item...');
                await xlsMenuItem.click();
            } else {
                throw new Error('XLS menu item not found.');
            }
            console.log('Successfully clicked on XLS format.');
        } catch (error) {
            console.error('Failed to select XLS format:', error);
            throw error;
        }

As the screenshot shows, the evaluation is still not complete:

[screenshot]

And this is how it should look after evaluation:

[screenshot]

This is the output from Node after running the script:

Clicking language selector...
Switching to English...
Opening report...
Selecting year...
Generating report...
Exporting file...
Hovering menu...
An error occurred: TimeoutError: Waiting for selector `li.ui-state-hover` failed: Waiting failed: 30000ms exceeded
    at new WaitTask (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/WaitTask.js:50:34)
    at IsolatedWorld.waitForFunction (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Realm.js:25:26)
    at CSSQueryHandler.waitFor (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/QueryHandler.js:172:95)
    at async CdpFrame.waitForSelector (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Frame.js:522:21)
    at async CdpPage.waitForSelector (/home/zenz/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/api/Page.js:1304:20)
    at async /home/zenz/ShinyApps/AID/puppeteer/tasks/MD_FDI_IIP_orig.js:98:9
Browser closed.

I tried varying the timeouts, but that doesn't seem to change anything.

Asked Nov 18, 2024 at 10:45 by David (edited Nov 18, 2024 at 12:00)

Comments:
  • li.ui-state-hover sounds like a class that is applied only while you hover over something, but there is no hovering happening in your code. In your log, the error appears after "Hovering menu...", which is not part of the code you provided. Are you sure you provided the correct code? – Tim Hansen Commented Nov 18, 2024 at 10:51
  • @TimHansen I added the lines you are referring to. They come after the point where the code breaks, which is why I didn't include them in the first place. – David Commented Nov 18, 2024 at 12:00

1 Answer


Instead of depending on timeouts, wait for DOM changes. For example, when certain data elements are rendered on the screens you mention, your script should wait for those DOM elements to be created at runtime.
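
A minimal sketch of that idea, assuming Puppeteer's documented `page.waitForSelector`, `page.click`, `page.hover`, and `page.waitForFunction` calls. The function name `generateAndExportXls` and the `reportReady` and `exportMenuParent` selectors are hypothetical placeholders, since the real page is not shown; `reportReady` should match an element that exists only once the report has rendered. Two things about the script in the question are worth checking: `waitUntil` is a navigation option, so it likely has no effect when passed to `click`, `hover`, or `waitForSelector`, and, as the comment on the question points out, `li.ui-state-hover` probably exists only while the menu is being hovered, so waiting for it before hovering cannot succeed.

```javascript
// Sketch: wait on DOM state instead of fixed timeouts.
// `sel.reportReady` and `sel.exportMenuParent` are hypothetical selectors;
// replace them with ones taken from the real page.
async function generateAndExportXls(page, sel, timeout = 60000) {
  // 1. Trigger report generation.
  await page.waitForSelector(sel.generateButton, { visible: true, timeout });
  await page.click(sel.generateButton);

  // 2. Block until an element that only exists in the finished report is visible.
  await page.waitForSelector(sel.reportReady, { visible: true, timeout });

  // 3. Open the export menu.
  await page.waitForSelector(sel.exportButton, { visible: true, timeout });
  await page.click(sel.exportButton);

  // 4. Hover the parent entry through a selector that exists *before* hovering,
  //    so the page can render the submenu.
  await page.waitForSelector(sel.exportMenuParent, { visible: true, timeout });
  await page.hover(sel.exportMenuParent);

  // 5. Poll the DOM until a menu item labelled "XLS" exists, then click it.
  const handle = await page.waitForFunction(
    (itemSelector, label) =>
      Array.from(document.querySelectorAll(itemSelector))
        .find((el) => el.textContent.trim() === label),
    { timeout },
    sel.menuItemText,
    'XLS'
  );
  const element = handle.asElement();
  if (!element) {
    throw new Error('XLS menu item did not resolve to a DOM element.');
  }
  await element.click();
}
```

With the selectors from the question, this could be called as `await generateAndExportXls(page, { generateButton: 'td.action_c2 span', reportReady: '<element that appears with the report>', exportButton: 'td.action_c1 span.ui-button-text', exportMenuParent: '<export menu entry>', menuItemText: 'span.ui-menuitem-text' })`, where the two bracketed values still have to be filled in from the real page.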
