javascript - Get title from newly opened page puppeteer - Stack Overflow

I am trying to switch to a newly opened tab and scrape the title of that page with Puppeteer.

This is what I have so far:

// use puppeteer
const puppeteer = require('puppeteer');

// set wait length in ms: 1000 ms = 1 s
const short_wait_ms = 1000;

async function run() {
    const browser = await puppeteer.launch({ headless: false, timeout: 0 });
    const page = await browser.newPage();

    await page.goto('https://biologyforfun.wordpress.com/2017/04/03/interpreting-random-effects-in-linear-mixed-effect-models/');

    // second page DOM elements
    const CLICKHERE_SELECTOR = '#post-2068 > div > div.entry-content > p:nth-child(2) > a:nth-child(1)';

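    // main page: pause briefly, then click the link
    // (page.waitFor is not available in newer Puppeteer releases; a plain timer works in any version)
    await new Promise(resolve => setTimeout(resolve, short_wait_ms));
    await page.click(CLICKHERE_SELECTOR);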


    // new tab opens - move to new tab
    let pages = await browser.pages();

    //go to the newly opened page

    //console.log title -- Generalized Linear Mixed Models in Ecology and in R

}

run();

I can't figure out how to use browser.pages() to start working on the new page.

asked Nov 16, 2017 at 11:44 by Alex

2 Answers

According to the Puppeteer Documentation:

page.title()

  • returns: <Promise<string>> Returns page's title.

Shortcut for page.mainFrame().title().

Therefore, you should use page.title() for getting the title of the newly opened page.

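Alternatively, you may gain a slight performance boost by evaluating document.title on the main frame directly. Note that the underscore-prefixed properties are private internals rather than documented API, so this may break between Puppeteer versions: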

page._frameManager._mainFrame.evaluate(() => document.title)
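
If reaching into underscore-prefixed internals is a concern, the same value can likely be read through the documented page.evaluate() call instead. This is a minimal sketch, assuming `page` is a Puppeteer Page object; the helper name `readDocumentTitle` is made up for illustration:

```javascript
// Hypothetical helper: reads document.title through the public page.evaluate() API.
// `page` is assumed to be a Puppeteer Page that has already navigated somewhere.
async function readDocumentTitle(page) {
  // The arrow function runs inside the browser page, where `document` exists.
  return page.evaluate(() => document.title);
}
```

Inside an async function it would be called as `const title = await readDocumentTitle(page);`.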

Note: Make sure to use the await operator when calling page.title(). It returns a Promise rather than a string, and the page's title tag must have loaded before the value is meaningful.
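
For the case in the question, where the click opens a new tab, one possible approach is to listen for the 'popup' event on the original page and call title() on the Page object it delivers. The sketch below assumes the link opens the tab in a way that Puppeteer reports as a popup, and that the new document contains a title element; `titleOfNewTab` is a hypothetical helper name, not part of Puppeteer:

```javascript
// Hypothetical helper: clicks a link that opens a new tab and returns that tab's title.
// `page` is assumed to be a Puppeteer Page; `selector` is the link's CSS selector.
async function titleOfNewTab(page, selector) {
  // Subscribe to 'popup' before clicking so the event cannot be missed.
  const popupPromise = new Promise(resolve => page.once('popup', resolve));
  await page.click(selector);

  // Resolves with the Page object for the newly opened tab.
  const newPage = await popupPromise;

  // Assumption: the new document has a <title> element; wait until it is in the DOM.
  await newPage.waitForSelector('title');

  // title() returns a Promise<string>; an async function unwraps it for the caller.
  return newPage.title();
}
```

With the names from the question, this would be used as `console.log(await titleOfNewTab(page, CLICKHERE_SELECTOR));` inside run().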

You shouldn't need to move to the new tab.

To get the title of any page you can use:

const pageTitle = await page.title();

Also, after you click something that triggers a page load, you should wait for the load event or for the network to become idle before reading the title:

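// Wait for the navigation to finish.
// Older Puppeteer releases accepted {waitUntil: 'networkidle', networkIdleTimeout: 1000};
// current releases use 'networkidle0' or 'networkidle2' instead.
await page.waitForNavigation({waitUntil: 'networkidle0'});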

Check the docs: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagewaitfornavigationoptions
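
When the click navigates the current tab (rather than opening a new one), the wait is usually started before the click so that a fast navigation is not missed. A minimal sketch of that pattern, assuming `page` is a Puppeteer Page; `clickAndGetTitle` is a hypothetical helper name:

```javascript
// Hypothetical helper: clicks a link that navigates the current tab and returns the new title.
// `page` is assumed to be a Puppeteer Page; `selector` is the link's CSS selector.
async function clickAndGetTitle(page, selector) {
  // Start waiting before the click; awaiting the click first could miss a fast navigation.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'load' }),
    page.click(selector),
  ]);

  // By now the new document has fired its load event, so the title is available.
  return page.title();
}
```

The 'load' value is only one choice here; a network-idle option could be passed instead if the page fills in its title late.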
