
javascript - puppeteer: How to wait for pages in SPA's? - Stack Overflow


I am trying to navigate through an SPA with Puppeteer. The problem is that I am unable to wait for the page to load before proceeding with my program.

I fill in a form and then click submit. Depending on the contents of the form, different pages can be loaded, so I can't use page.waitFor(selector): there are many possible pages depending on the input.

I tried using waitUntil: load, networkidle2, networkidle0, domcontentloaded but all of them trigger before the elements are loaded.

The page I am trying to automate is Link. (If you want to check for yourself, then choose booking reference and fill out random details and press continue.)

After choosing "booking-reference" in the link, I fill in the details with Puppeteer and then press the continue button. What I cannot figure out is how to wait for the page to be completely loaded without relying on selectors.

Asked Mar 26, 2018 at 11:12 by Nagarjun Prasad
  • Does this answer your question? How to listen to history.pushstate with Puppeteer? – ggorlen, Oct 31, 2020 at 23:50

3 Answers

I think you should know what those pages are and use Promise.race with page.waitForSelector (page.waitFor in older Puppeteer versions) for each page, like this:

const puppeteer = require('puppeteer');

const html = `
<html>
  <body>
    <div id="element"></div>
    <button id="button">load</button>

    <script>
      document.getElementById('button').addEventListener("click", () => {
        document.getElementById('element').innerHTML =
          '<div id="element' + (Math.floor(Math.random() * 3) + 1)  + '"></div>';
      });
    </script>
  </body>
</html>`;

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(`data:text/html,${html}`);

  await page.click('#button');

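  // page.waitFor was deprecated and later removed from Puppeteer;
  // page.waitForSelector is the equivalent call for selectors.
  const element = await Promise.race([
    page.waitForSelector('#element1'),
    page.waitForSelector('#element2'),
    page.waitForSelector('#element3')
  ]);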

  console.log(await (await element.getProperty('id')).jsonValue());
  await browser.close();
})();
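
If you also need to know which of the candidate pages actually loaded, the same Promise.race idea can be wrapped in a small helper. This is only a sketch: waitForAny is a hypothetical name, not a Puppeteer API, and it assumes page.waitForSelector behaves as in current Puppeteer releases (it resolves with an element handle and rejects on timeout).

```javascript
// Hypothetical helper (not part of Puppeteer): waits for whichever of the
// given selectors appears first and reports which one it was.
async function waitForAny(page, selectors, options = {}) {
  const waits = selectors.map(selector =>
    page
      .waitForSelector(selector, options)
      .then(handle => ({ selector, handle }))
  );

  // The losing waits keep running until they time out. Attach a no-op
  // handler so their rejections are not reported as unhandled.
  waits.forEach(wait => wait.catch(() => {}));

  // Resolves with the first match; rejects if a wait fails before any match.
  return Promise.race(waits);
}
```

With the example above, const {selector, handle} = await waitForAny(page, ['#element1', '#element2', '#element3']); would give both the matching selector and its element handle.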

For those looking for a quick answer, here's the main code:

await Promise.all([page.waitForNavigation(), el.click()]);

...where el is a link that points to another page in the SPA and click is an action that triggers navigation. See below for details.


I agree that waitFor isn't too helpful if you can't rely on page content. Even if you can, in most cases it seems like a less desirable approach than naturally reacting to the navigation. Luckily, page.waitForNavigation does work on SPAs. Here's a minimal, complete example of navigating between pages using a click event on a link (the same should work for a form submission) on a tiny vanilla SPA which uses the history API (index.html below).

index.html:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
  </head>
  <body>
    <script>
      const nav = `<a href="/">Home</a> | <a href="/about">About</a> | 
                   <a href="/contact">Contact</a>`;
      const routes = {
        "/": `<h1>Home</h1>${nav}<p>Welcome home!</p>`,
        "/about": `<h1>About</h1>${nav}<p>This is a tiny SPA</p>`,
      };
      const render = path => {
        document.body.innerHTML = routes[path] || `<h1>404</h1>${nav}`;
        document.querySelectorAll('[href^="/"]').forEach(el => 
          el.addEventListener("click", evt => {
            evt.preventDefault();
            const {pathname: path} = new URL(evt.target.href);
            window.history.pushState({path}, path, path);
            render(path);
          })
        );
      };
      window.addEventListener("popstate", e =>
        render(new URL(window.location.href).pathname)
      );
      render("/");
    </script>
  </body>
</html>

index.js:

const puppeteer = require("puppeteer"); // ^21.4.1

let browser;
(async () => {
  browser = await puppeteer.launch({});
  const [page] = await browser.pages();
  const text = s => page.$eval(s, el => el.textContent);

  // navigate to the home page for the SPA and print the contents
  await page.goto("http://localhost:8000");
  console.log(page.url()); // => http://localhost:8000/
  console.log(await text("p")); // => Welcome home!

  // navigate to the about page via the link
  const a = await page.waitForSelector("text/About");
  await Promise.all([page.waitForNavigation(), a.click()]);

  // show proof that we're on the about page
  console.log(page.url()); // => http://localhost:8000/about
  console.log(await text("p")); // => This is a tiny SPA
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());

Sample run:

$ python3 -m http.server &
$ node index
http://localhost:8000/
Welcome home!
http://localhost:8000/about
This is a tiny SPA

If the await Promise.all([page.waitForNavigation(), el.click()]); pattern seems strange, see this issue thread which explains the gotcha that the intuitive

await page.waitForNavigation(); 
await el.click();

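fails: awaiting page.waitForNavigation() first blocks until it times out, because the click that triggers the navigation never runs. The reverse order (click first, then wait) causes a race condition, since the navigation can finish before the wait starts.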

The same thing as the Promise.all shown above can be done with:

const navPromise = page.waitForNavigation({timeout: 1000});
await el.click();
await navPromise;
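
If you use this pattern in several places, it can be wrapped in a small function so the ordering is written only once. This is a sketch under assumptions: clickAndWaitForNavigation is a hypothetical name, and el is assumed to be an element handle whose click triggers a navigation.

```javascript
// Hypothetical helper (not part of Puppeteer): starts waiting for the
// navigation BEFORE clicking, so a fast navigation cannot be missed.
async function clickAndWaitForNavigation(page, el, options = {}) {
  // Create the promise first, but do not await it yet.
  const navPromise = page.waitForNavigation(options);

  // If the click throws, nobody would be awaiting navPromise any more;
  // this no-op handler keeps a later timeout from going unhandled.
  navPromise.catch(() => {});

  await el.click();

  // Resolves once the navigation completes. Puppeteer resolves with null
  // for History API navigations, since there is no new HTTP response.
  return navPromise;
}
```

In index.js above, the Promise.all line could then be written as await clickAndWaitForNavigation(page, a);.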

See this related answer for more on navigating SPAs with Puppeteer including hash routers.

Here is a workaround for single-page applications that waits for navigation and returns the response status and data. Whether the Ajax request uses fetch or XHR, the main idea is the same. The following example demonstrates it with fetch:

const puppeteer = require('puppeteer');
const url = 'http://www.faalkaart.nl';

// Clicks the element and resolves with the status and body of the first
// fetch() response that arrives after the click.
async function spaClick (page, selector) {
  const res = await page.$eval(selector, el => {
    // Keep the real fetch so repeated calls don't stack wrappers
    window.originalFetch = window.originalFetch || window.fetch
    return new Promise(resolve => {
      window.fetch = function (...args) {
        return window.originalFetch.apply(this, args)
          .then(async response => {
            resolve({
              status: response.status,
              // clone() leaves the body readable for the page's own code
              data: await response.clone().text()
            })

            return response
          })
      }

      el.click()
    })
  })

  if (!res) throw new Error('spaClick() Navigation triggered before eval resolves!')
  return res
}

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(url);
    // await Promise.all([
    //   page.waitForNavigation({ waitUntil: 'networkidle0' }),
    //   page.click('selector-that-triggers-navigation'),
    // ]);
    const response = await spaClick(page, 'selector-that-triggers-navigation')
    console.log(response) // {status, data}
    await browser.close();
})();
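
A different technique that may achieve the same result without patching window.fetch is Puppeteer's own page.waitForResponse, which accepts a URL or a predicate. The sketch below is an assumption-laden alternative, not the method used above: clickAndWaitForResponse and the isTarget predicate are hypothetical names, and you would need to know something about the request you expect (for example part of its URL).

```javascript
// Hypothetical helper (not part of Puppeteer): clicks the element and
// resolves with the status and body of the first response that the
// caller-supplied predicate accepts.
async function clickAndWaitForResponse(page, selector, isTarget, options = {}) {
  // Start waiting before the click so a fast response is not missed.
  const [response] = await Promise.all([
    page.waitForResponse(isTarget, options),
    page.click(selector),
  ]);

  return {
    status: response.status(),
    data: await response.text(),
  };
}
```

A call might look like await clickAndWaitForResponse(page, 'selector-that-triggers-navigation', r => r.url().includes('/api/')), where '/api/' is only a placeholder for whatever endpoint the SPA really calls. Unlike the fetch patch, this observes the request at the browser level, so it should apply to XHR as well.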
