javascript - scrape data- attribute using goutte? - Stack Overflow

How can I scrape a data- attribute from an <a> link using Goutte and Laravel?

I want to scrape a tag like so:

<a class="ProfileNav-stat ProfileNav-stat--link u-borderUserColor u-textCenter js-tooltip js-nav u-textUserColor" data-nav="following" href="/rogerhamilton/following" data-original-title="987,358 Following">

Within this <a> link I then want to scrape the value of the data-original-title attribute.

My code is:

$client = new Client();

//  Hackery to allow HTTPS
$guzzleclient = new \GuzzleHttp\Client([
    'timeout' => 60,
    'verify' => false,
]);

//  Have Goutte send its requests through the Guzzle client configured above
$client->setClient($guzzleclient);
$crawler = $client->request('GET', 'url');


$elements = $crawler->filter('.ProfileNav-stat.ProfileNav-stat--link')->each(function($node){
    $x = $node->filter('data-original-title');
    dd($x);
});

but it doesn't return the correct data.

asked May 14, 2017 at 11:24 by kevinabraham

1 Answer

For anyone else who comes across this issue: it's as simple as filtering down to the link and then reading the attribute from it, with something like $node->filter('.classname or #ID')->attr('data-original-title').
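
The attempt in the question most likely fails because filter() expects a CSS selector, so 'data-original-title' is read as an element name and matches nothing; the attribute value has to be read with attr() on the matched node. A minimal sketch of that idea follows, assuming symfony/dom-crawler and symfony/css-selector are installed through Composer. The inline $html string is a stand-in for the fetched page and is only there for illustration; with Goutte, the Crawler returned by $client->request() is the same Symfony class and could be used in its place.

```php
<?php
// Assumption: Composer autoloader with symfony/dom-crawler and symfony/css-selector installed.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Sample markup copied from the question. In real use the Crawler would
// come from Goutte: $crawler = $client->request('GET', $url);
$html = <<<'HTML'
<html><body>
<a class="ProfileNav-stat ProfileNav-stat--link u-borderUserColor u-textCenter js-tooltip js-nav u-textUserColor" data-nav="following" href="/rogerhamilton/following" data-original-title="987,358 Following">Following</a>
</body></html>
HTML;

$crawler = new Crawler($html);

// filter() selects elements by CSS selector; attr() then reads the named
// attribute from each matched node. each() collects the callback's return values.
$titles = $crawler
    ->filter('a.ProfileNav-stat.ProfileNav-stat--link')
    ->each(function (Crawler $node) {
        return $node->attr('data-original-title');
    });

print_r($titles);
```

If only links that actually carry the attribute are wanted, an attribute selector such as 'a[data-original-title]' can be passed to filter() instead of the class names.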
