
python - How to implement pre_navigation_hook of PlaywrightCrawler in crawlee py? - Stack Overflow


I've been using Crawlee for Python recently to build reliable web scrapers, but sometimes I need to run some code before the crawler navigates to the page URL, for example:

  • blocking specific resources (e.g. images, media, etc.)
  • passing the created page to playwright-stealth like this: await stealth_async(page)

I searched the docs for a parameter or class with similar functionality and found that PlaywrightCrawler has something called pre_navigation_hook. However, I couldn't find out how to use it, and there is no tutorial on their site demonstrating it.

Please provide an example of how to use it, or any other way to achieve the points above.

Note: according to the docs, pre_navigation_hook receives a PlaywrightPreNavCrawlingContext, not the PlaywrightCrawlingContext that almost all other request handlers receive, in case that is a useful hint.
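
For reference, here is a minimal sketch of how such a hook might be registered. It assumes a recent Crawlee for Python release where pre_navigation_hook is used as a decorator on the crawler instance and the classes are importable from crawlee.crawlers (older releases exposed them from a different module, so the import path may need adjusting). BLOCKED_RESOURCE_TYPES, should_block, and the example URL are hypothetical names chosen for illustration, not part of Crawlee. The blocking itself relies on Playwright's page.route, and the stealth call from the list above is left as a comment because it needs the separate playwright-stealth package.

```python
import asyncio

# Hypothetical set of Playwright resource types to block (illustrative only).
BLOCKED_RESOURCE_TYPES = {'image', 'media', 'font'}


def should_block(resource_type: str) -> bool:
    """Return True if a request of this Playwright resource type should be aborted."""
    return resource_type in BLOCKED_RESOURCE_TYPES


async def main() -> None:
    # Imported here so the helper above stays usable without Crawlee installed.
    # Assumption: recent Crawlee versions expose these names from crawlee.crawlers.
    from crawlee.crawlers import (
        PlaywrightCrawler,
        PlaywrightCrawlingContext,
        PlaywrightPreNavCrawlingContext,
    )

    crawler = PlaywrightCrawler(max_requests_per_crawl=5)

    @crawler.pre_navigation_hook
    async def before_navigation(context: PlaywrightPreNavCrawlingContext) -> None:
        # Runs after the page is created but before it navigates to the URL.
        context.log.info(f'About to navigate to {context.request.url}')

        async def handle_route(route) -> None:
            # Abort unwanted resource types, let everything else through.
            if should_block(route.request.resource_type):
                await route.abort()
            else:
                await route.continue_()

        await context.page.route('**/*', handle_route)

        # With playwright-stealth installed, the page could be patched here:
        # await stealth_async(context.page)

    @crawler.router.default_handler
    async def request_handler(context: PlaywrightCrawlingContext) -> None:
        # Regular handler, called once navigation has finished.
        context.log.info(f'Loaded {context.request.url}')

    # Placeholder URL, used purely for illustration.
    await crawler.run(['https://example.com'])


if __name__ == '__main__':
    asyncio.run(main())
```

Because the hook runs before navigation, the route handler is already in place when the first request for the page goes out, which is why the blocking is registered there rather than in the regular request handler.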
