
javascript - How to retrieve reCAPTCHA token from iframe with Puppeteer? - Stack Overflow


I'm writing a bot that needs to scrape a reCAPTCHA token after a task has been completed. I'm trying to use:

await page.evaluate(() => document.getElementById('recaptcha-token').value)

after the captcha has loaded onto the page. However, each time I get the same error: Uncaught (in promise) Error: Evaluation failed: TypeError: Cannot read property 'value' of null.

I believe that this error is in part caused by the fact that the element that I'm trying to fetch is of type hidden:

<input type="hidden" id="recaptcha-token" value="[very long string of letters and numbers]">

How would I go about bypassing this?


Asked Jun 14, 2020 at 10:41 by Devon Caron; edited Jun 14, 2020 at 16:27 by theDavidBarton.

1 Answer


First, I really recommend reading Thomas Dondorf's answer on the Puppeteer + reCAPTCHA topic.

If you still want to go this way, read my answer below:


The fact that the <input> is type=hidden does not affect how Puppeteer interacts with the element, because it is already in the DOM. You can test this from the Chrome DevTools Console tab by running $('#recaptcha-token').value (with the console's JavaScript context set to the reCAPTCHA iframe): you will get its value without any problems. The problem lies elsewhere.

You are currently facing two issues:

1.) The reCAPTCHA is inside an iframe, and you need to step into it so that Puppeteer can interact with the desired element. That is why document.getElementById returns null when evaluated on the top-level page. To step in, grab the iframe's element handle, then use contentFrame() to switch from the "browser" context to the "frame" context.

2.) You will also need the following security-disabling args when launching Puppeteer: args: ['--disable-web-security', '--disable-features=IsolateOrigins,site-per-process'], because the same-origin policy does not allow you to go inside the iframe by default.
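As an alternative to selecting the iframe with a CSS selector, the frame can also be picked out of page.frames() by its URL. The following is only a minimal sketch: findRecaptchaFrame is a hypothetical helper name, and it assumes the checkbox widget is served from a URL containing "/recaptcha/" and "/anchor", which may not hold for every reCAPTCHA version.

```javascript
// Pick the reCAPTCHA checkbox frame out of a list of Puppeteer Frame objects.
// Assumption: the widget's iframe URL contains "/recaptcha/" and "/anchor".
// Adjust the match if the page you target serves the widget differently.
function findRecaptchaFrame(frames) {
  const match = frames.find((frame) => {
    const url = frame.url()
    return url.includes('/recaptcha/') && url.includes('/anchor')
  })
  // Return null rather than undefined so callers can check explicitly.
  return match || null
}

// Intended use inside a Puppeteer script:
//   const frame = findRecaptchaFrame(page.frames())
//   if (!frame) throw new Error('reCAPTCHA frame not found')
```

Matching on the URL avoids depending on the exact nesting of div elements around the iframe, at the cost of depending on the URL pattern instead.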

Example reCAPTCHA page: https://patrickhlauke.github.io/recaptcha/

Example script:

const puppeteer = require('puppeteer')

async function getToken() {
  // These flags relax the same-origin policy and site isolation: use them for testing only.
  const browser = await puppeteer.launch({
    headless: false,
    args: ['--disable-web-security', '--disable-features=IsolateOrigins,site-per-process']
  })

  try {
    const page = await browser.newPage()
    await page.goto('https://patrickhlauke.github.io/recaptcha/')

    // Wait for the reCAPTCHA iframe, then switch into its frame context.
    const elementHandle = await page.waitForSelector('.g-recaptcha > div > div > iframe')
    const frame = await elementHandle.contentFrame()

    // Read the hidden input from inside the frame, not from the top-level page.
    const value = await frame.evaluate(() => document.getElementById('recaptcha-token').value)
    console.log(value)
  } catch (e) {
    console.error(e)
  } finally {
    // Close the browser even if something above threw.
    await browser.close()
  }
}

getToken()
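
If the frame has loaded but the hidden input has not been inserted yet, the same "of null" error can still appear. One way to guard against that, sketched here under the assumption that frame is a Puppeteer Frame such as the one returned by contentFrame(), is to wait for the input before reading it. readRecaptchaToken and timeoutMs are hypothetical names introduced for illustration.

```javascript
// Wait for the hidden token input to exist in the frame, then read its value.
// `frame` is expected to be a Puppeteer Frame; `timeoutMs` is an illustrative
// parameter that is passed through as waitForSelector's `timeout` option.
async function readRecaptchaToken(frame, timeoutMs = 10000) {
  // By default waitForSelector only waits for presence in the DOM, which is
  // what we want here: a hidden input never becomes visible.
  await frame.waitForSelector('#recaptcha-token', { timeout: timeoutMs })
  // $eval runs the callback inside the frame with the matched element.
  return frame.$eval('#recaptcha-token', (input) => input.value)
}

// Intended use inside the script above:
//   const value = await readRecaptchaToken(frame)
```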