I'm using Scrapy and Playwright to scrape Booking. To do this I need to click a button and capture the AJAX response, but when I run my code it raises this error:
TypeError: Page.locator() missing 1 required positional argument: 'selector'
This is my code:
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.selector import Selector
from scrapy.http import HtmlResponse
from scrapy_playwright.page import PageMethod
from playwright.async_api import Page, expect

class BookingSpider(scrapy.Spider):
    name = 'booking'
    start_urls = ["https://www.booking/hotel/it/hotelnordroma.en-gb.html?aid=304142&checkin=2025-04-15&checkout=2025-04-16#map_opened-map_trigger_header_pin"]

    def start_requests(self):
        yield scrapy.Request(self.start_urls[0], meta={
            "playwright": True,
            "playwright_include_page": True,
            "playwright_page_methods": [
                PageMethod("wait_for_selector", ".e1793b8db2")
            ]
        })

    def parse(self, response):
        Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
        with open("copy.txt", "w", encoding="utf-8") as file:
            file.write((response.text))

process = CrawlerProcess()
process.crawl(BookingSpider)
process.start()
Error message:
  File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\twisted\internet\defer.py", line 1088, in _runCallbacks
    current.result = callback( # type: ignore[misc]
                     ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
        current.result, *args, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "c:\Users\mojsa\AppData\Local\Programs\Python\Python313\Lib\site-packages\scrapy\spiders\__init__.py", line 86, in _parse
    return self.parse(response, **kwargs)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "J:\SeSa\booking\booking\spiders\booking.py", line 27, in parse
    Page.locator("xpath=//*[@class='e1793b8db2'][1]").click()
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Page.locator() missing 1 required positional argument: 'selector'
asked Feb 21 at 17:46 by Mojsa, edited Feb 25 at 12:17
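1 Answer
Score: -1
Issues:
1. start_urls usage in start_requests: start_urls is a class attribute, so inside start_requests it has to be referenced as self.start_urls. The posted code already does this.
2. Incorrect use of Page.locator: Page is the Playwright class, not a page object. Calling Page.locator("...") on the class binds the selector string to self and leaves selector unfilled, which is exactly the TypeError shown. Because the request sets "playwright_include_page": True, the live page object is available as response.meta["playwright_page"]; call locator on that object, and make parse an async def so the click can be awaited.
3. Indentation of CrawlerProcess: process = CrawlerProcess() and the lines that follow it belong at module level, not inside the class.
4. Imports: the script needs scrapy, CrawlerProcess (from scrapy.crawler), and PageMethod. PageMethod comes from scrapy_playwright.page, not from playwright. The Page import from playwright.async_api is not needed once the page is taken from response.meta.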
Comments:
Page.locator() needs two arguments but you use only one. Maybe it needs response as second (or first) argument. OR maybe you should use response.locator() instead of Page.locator()? – furas Commented Feb 21 at 19:53
page = response.meta["playwright_page"] like in question "python - Scrapy and Scrapy-playwright scrape first comment of every page instead of every comment for every page - Stack Overflow". And maybe later use this instance page instead of class name Page – furas Commented Feb 21 at 19:59
Page comes from. – lmtaq Commented Feb 21 at 22:36
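The difference between the class name Page and an instance page, which the comments point at, can be reproduced without Playwright. The sketch below uses a hypothetical stand-in class (it is not Playwright's Page, and its locator just returns a string) to show why calling a method through the class makes Python report selector as missing.

```python
class Page:
    """Hypothetical stand-in for Playwright's Page, only for this demo."""

    def locator(self, selector):
        return f"locator({selector!r})"


# Called through the class: the string is bound to `self`,
# so nothing is left to fill `selector` and Python raises TypeError.
try:
    Page.locator("xpath=//button")
except TypeError as exc:
    error_message = str(exc)

# Called through an instance: `self` is supplied automatically,
# and the string fills `selector` as intended.
page = Page()
result = page.locator("xpath=//button")

print(error_message)
print(result)
```

This is why Page.locator() appears to "need two arguments": the first one is the page object itself, which in scrapy-playwright comes from response.meta["playwright_page"].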