javascript - Proper way to get XPath lists in Chrome Puppeteer - Stack Overflow

I'm using Chrome Puppeteer to get at some content on a Web page. This content is a list of items in a pseudo-table. I'm using XPath to get this content.

When I tested the XPath expression [in Chrome with the XPath Helper extension] it displayed the list of text, so I know the XPath expression is fine.

However, I'm having issues trying to do this with Puppeteer. Below is the relevant code [I omitted the opening and closing puppeteer code]:

var xpath_expr_str = "//div[contains(@class,'listings')]/div[4]/p/a";
var page_url_str = 'https://my-url';

await page.goto(page_url_str);
await page.waitForXPath(xpath_expr_str);

var xpath_payload_arr = await page.$x(xpath_expr_str);
var xpath_val_arr = await page.evaluate(function(payload_arr){
    var url_list_arr = [];
    for(var i = 0; i < payload_arr.length; i++)
    {
        url_list_arr.push(payload_arr[i].textContent);
    }
    return url_list_arr;
}, xpath_payload_arr);

console.log(xpath_val_arr);

When I run this, I get the following error:

UnhandledPromiseRejectionWarning: TypeError: Converting circular structure to JSON

I can't seem to get at the list. However, if I try to get just a single item from the list, it works fine. For example, the following code works:

var xpath_val_str = await page.evaluate(function(payload_arr){
    return payload_arr.textContent;
}, xpath_payload_arr[0]);
console.log(xpath_val_str);

What's the proper way to manage XPath lists when working with Puppeteer?

Asked Jun 13, 2018 at 18:59 by ObiHill

1 Answer

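Unfortunately, you cannot pass xpath_payload_arr into page.evaluate directly. page.$x returns a plain array of ElementHandle objects, and an array is not itself a handle, so Puppeteer tries to serialize it to JSON. The handles inside it hold internal references that loop back on themselves, which triggers the "Converting circular structure to JSON" error. A single ElementHandle, by contrast, is accepted as an argument and resolved to the element in the page, which is why passing xpath_payload_arr[0] works.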
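The error message itself comes from JSON serialization running into a cycle, and that can be reproduced without a browser. The following is only a minimal sketch: makeFakeHandle and trySerialize are hypothetical helpers invented for illustration, and the fake object merely mimics a self-reference. It is not Puppeteer's actual ElementHandle.

```javascript
// Hypothetical stand-in for a handle: an object that refers back to
// itself through one of its properties. Real handles are more complex;
// this only mimics the cycle.
function makeFakeHandle(name) {
  const handle = { name };
  handle.owner = { handle }; // cycle: handle -> owner -> handle
  return handle;
}

// Attempt JSON serialization and report whether it succeeded.
function trySerialize(value) {
  try {
    return { ok: true, json: JSON.stringify(value) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// An array of plain strings serializes without trouble.
const plainResult = trySerialize(['a', 'b']);

// An array holding an object with a cycle cannot be serialized.
const handleResult = trySerialize([makeFakeHandle('a')]);

console.log(plainResult.ok, handleResult.ok);
console.log(handleResult.error);
```

In Node, the second call fails with a TypeError whose message starts with "Converting circular structure to JSON", the same wording as in the question.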

However, we can iterate over the array in the Node context and call page.evaluate on the items one by one:

var xpath_expr_str = '//*[@id="questions"]/div/div/h3/a';
var page_url_str = 'https://stackoverflow.com/questions/tagged/puppeteer';

await page.goto(page_url_str);
await page.waitForXPath(xpath_expr_str);

var xpath_payload_arr = await page.$x(xpath_expr_str);

var url_list_arr = [];
for(var i = 0; i < xpath_payload_arr.length; i++)
{
    url_list_arr.push(await page.evaluate(el => el.textContent, xpath_payload_arr[i]));
}

console.log(url_list_arr);

This produces the expected result: an array with the text of each matched link.
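The loop above awaits each page.evaluate call in turn. If the order of evaluation does not matter, the same idea can be written with Promise.all so the calls are issued together. This is a sketch under assumptions, not part of the original answer: textsFromHandles is a hypothetical helper name, and fakePage is a stub that only imitates the page.evaluate(fn, arg) contract so that the example runs without a browser.

```javascript
// Collect textContent for every handle, one page.evaluate call per handle.
// `page` is any object exposing an evaluate(fn, arg) method that returns
// a promise, as Puppeteer's page does.
async function textsFromHandles(page, handles) {
  return Promise.all(
    handles.map((handle) => page.evaluate((el) => el.textContent, handle))
  );
}

// Hypothetical stub standing in for a Puppeteer page: it simply calls
// the function with the argument instead of running it in a browser.
const fakePage = {
  evaluate: async (fn, arg) => fn(arg),
};

// Plain objects with a textContent property stand in for element handles.
const demo = textsFromHandles(fakePage, [
  { textContent: 'first' },
  { textContent: 'second' },
]);

demo.then((texts) => console.log(texts));
```

With a real page, the call would be textsFromHandles(page, xpath_payload_arr), still passing each ElementHandle individually rather than the whole array.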
