javascript - Puppeteer: proper selection of inner text - Stack Overflow

I want to grab a string that has a particular class name, let's say 'CL1'.

This is what I used to do, and it worked (we are inside an async function):

var counter = await page.evaluate(() => {
  return document.querySelector('.CL1').innerText;
});

Now, after some months, when I try to run the code I get this error:

Error: Evaluation failed: TypeError: Cannot read property 'innerText' of null

I did some debugging with console.log() calls before and after the previous snippet and found that it is the culprit.

I looked at the source of the webpage, and the class is in there.

But I found two more elements with the same class name.

All three of them are nested deep inside many other elements.

So what is the proper way to select the one I want, given that I know the class hierarchy for the one I am interested in?

EDIT: Since there are three elements with the same class name, and I want to extract info from the first, can I use array notation on querySelector() to access the information from the first one?

EDIT 2: I ran this:

return document.querySelector('.CL1').length;

and I got

Error: Evaluation failed: TypeError: Cannot read property 'length' of null

This gets even more confusing...

EDIT 3: I tried the suggestion of Md Abu Taher and saw that the snippet of code he provided did not return undefined. This means that the selector is visible to my code.

Then I ran this snippet of code:

var counter = await page.evaluate(() => {
  return document.querySelector('#react-root > section > main > div > header > section > ul > li:nth-child(1) > a > span').innerText;
});

And I got back the same error:

Error: Evaluation failed: TypeError: Cannot read property 'innerText' of null

Asked Jun 21, 2019 at 14:40 by user1584421; edited Jun 22, 2019 at 15:38.
  • 1 can you provide us with the url of the page you're trying to access? – Krzysztof Krzeszewski Commented Jun 21, 2019 at 14:42
  • 1 Other than making sure that the class name didn't change, verify that you're waiting for the page to load before calling querySelector. – zaquest Commented Jun 21, 2019 at 14:43
  • @KrzysztofKrzeszewski Thank you for the effort, but it is an intranet URL – user1584421 Commented Jun 21, 2019 at 14:49
  • @zaquest Yes, this is handled just fine. – user1584421 Commented Jun 21, 2019 at 14:49
  • are you waiting for the content to be loaded in the dom? – Joey Gough Commented Jun 21, 2019 at 15:18

2 Answers

The answer is divided into two parts: getting the right selector, and getting the data.

1. Getting the right selector

Use inspect element

  • Right-click your desired element and click Inspect.
  • Then right-click the highlighted node in the Elements panel and choose Copy > Copy selector.

This will give you a unique selector for that specific element.

Use a selector tool

There are a bunch of Chrome extensions that help you find the right selector.

  • Selectorgadget
  • Get Unique CSS Selector
  • Copy Css Selector

2. Getting the data

Given your selector is .CL1, you need to do a few things.

Wait for all network events to finish

On navigation, you can wait until the network is idle ('networkidle2' means no more than 2 network connections for at least 500 ms).

await page.goto(url, {waitUntil: 'networkidle2'});

Wait for the element to appear in the DOM.

Even if the network is idle, there might still be a redirect or similar. The best choice is to wait until the element appears. The following waits until the element is found and throws an error if it does not appear before the timeout.

await page.waitForSelector('.CL1');
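
The wait-then-read step can be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: readText is a hypothetical name, and it assumes page is a Puppeteer Page object. It relies on page.waitForSelector (the selector-specific wait) and page.$eval, and it resolves to null instead of throwing when the element never shows up.

```javascript
// Hypothetical helper: wait for `selector`, then return its innerText.
// Resolves to null if the element does not appear within `timeout` ms.
async function readText(page, selector, timeout = 5000) {
  try {
    // Rejects if nothing matches the selector before the timeout.
    await page.waitForSelector(selector, { timeout });
  } catch (err) {
    // Assumption: any failure while waiting is treated as "not found".
    return null;
  }
  // $eval runs the callback in the page against the first matching element.
  return page.$eval(selector, (el) => el.innerText);
}
```

Inside the async function from the question it might be called as `var counter = await readText(page, '.CL1');`. Swallowing every error while waiting is a deliberate simplification; stricter code would probably rethrow anything that is not a timeout.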

Or, check whether the element exists and return data only if it does

If you do not want to throw an error, or if the element appears only some of the time, check for its existence before reading the data.

await page.evaluate(() => {
  const element = document.querySelector('.CL1');
  return element && element.innerText; // returns null if the element is not found
});
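
On the question's EDIT about array notation: querySelector returns a single element (or null), so it has no length and cannot be indexed, whereas querySelectorAll returns a NodeList that can. The following is a minimal sketch under the assumption that the matching elements are already in the DOM when it runs; textOfMatch is a hypothetical helper, and its root parameter exists only so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: innerText of the index-th element matching `selector`
// under `root`, or null when there are fewer matches than that.
function textOfMatch(root, selector, index = 0) {
  const matches = root.querySelectorAll(selector); // NodeList, possibly empty
  const el = matches[index];
  return el ? el.innerText : null;
}

// Inside Puppeteer the same logic has to run in the page context, e.g.:
//   const counter = await page.evaluate((sel) => {
//     const el = document.querySelectorAll(sel)[0];
//     return el ? el.innerText : null;
//   }, '.CL1');
```

If the class hierarchy is known, a shorter descendant selector such as 'header .CL1' (header appears in the long selector from the question) may also narrow the match, provided only one .CL1 sits under that ancestor.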

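Try to verify the element first. The lookup has to run inside page.evaluate(), because document only exists in the page context.

var x = document.getElementsByClassName("example")[0]; // first match of the collection

OR

var x = document.getElementById("example");

and then

var counter = await page.evaluate(() => {
  var x = document.getElementsByClassName("example")[0];
  return x ? x.innerText : null; // null if nothing matched
});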