
javascript - invoking onclick event with beautifulsoup python - Stack Overflow


I am trying to fetch the links to all accommodations in Cyprus from this website: http://www.zoover.nl/cyprus

So far I can retrieve the first 15, which are already shown. Now I have to invoke a click on the "volgende" ("next") link. However, I don't know how to do that, and I can't track down in the page source which function the click calls, so I can't use something like the approach posted here: Issues with invoking "on click event" on the html page using beautiful soup in Python

I only need the step where the "clicking" happens so I can fetch the next 15 links and so on.

Does anybody know how to help? Thanks already!

EDIT:

My code looks like this now:

def getZooverLinks(country):
    zooverWeb = "http://www.zoover.nl/"
    url = zooverWeb + country
    parsedZooverWeb = parseURL(url)
    driver = webdriver.Firefox()
    driver.get(url)

    button = driver.find_element_by_class_name("next")
    links = []
    for page in xrange(1,3):
        for item in parsedZooverWeb.find_all(attrs={'class': 'blue2'}):
            for link in item.find_all('a'):
                newLink = zooverWeb + link.get('href')
                links.append(newLink)
        button.click()

and I get the following error:

selenium.common.exceptions.StaleElementReferenceException: Message: Element is no longer attached to the DOM
Stacktrace:
    at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:8956)
    at Utils.getElementAt (file:///var/folders/n4/fhvhqlmx23s8ppxbrxrpws3c0000gn/T/tmpKFL43_/extensions/[email protected]/components/command-processor.js:8546)
    at fxdriver.preconditions.visible (file:///var/folders/n4/fhvhqlmx23s8ppxbrxrpws3c0000gn/T/tmpKFL43_/extensions/[email protected]/components/command-processor.js:9585)
    at DelayedCommand.prototype.checkPreconditions_ (file:///var/folders/n4/fhvhqlmx23s8ppxbrxrpws3c0000gn/T/tmpKFL43_/extensions/[email protected]/components/command-processor.js:12257)
    at DelayedCommand.prototype.executeInternal_/h (file:///var/folders/n4/fhvhqlmx23s8ppxbrxrpws3c0000gn/T/tmpKFL43_/extensions/[email protected]/components/command-processor.js:12274)
    at DelayedCommand.prototype.executeInternal_ (file:///var/folders/n4/fhvhqlmx23s8ppxbrxrpws3c0000gn/T/tmpKFL43_/extensions/[email protected]/components/command-processor.js:12279)
    at DelayedCommand.prototype.execute/< (file:///var/folders/n4/fhvhqlmx23s8ppxbrxrpws3c0000gn/T/tmpKFL43_/extensions/[email protected]/components/command-processor.js:12221)

I'm confused :/

Asked Apr 1, 2015 at 7:29 by steph; edited May 23, 2017 at 12:06 by Community Bot

2 Answers

While it might be tempting to try to do this from within BeautifulSoup, it has no way to execute JavaScript: in the end, BeautifulSoup is a parser rather than an interactive web browsing client.

You should seriously consider solving this with selenium, as briefly shown in this answer. There are pretty good Python bindings available for selenium.

You could just use selenium to find the element and click it, and then pass the page on to Beautifulsoup, and use your existing code to fetch the links.

Alternatively, you could use the JavaScript that's listed in the onclick handler. I pulled this from the source: EntityQuery('Ns=pPopularityScore%7c1&No=30&props=15292&dims=530&As=&N=0+3+10500915');. The No parameter increases by 15 for each page, but I can only guess at what props means. I'd recommend not getting into this, though, and just interacting with the website as a client would, using Selenium. That's also much more robust to changes on their side.
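For illustration only, the offset arithmetic described above could be sketched as follows. The helper name `query_for_page` is hypothetical, and the mapping of the first page to `No=0` is an assumption; only the step of 15 and the sample query string come from the page source quoted above. The other parameters, including `props`, are left untouched because their meaning is unknown.

```python
import re

PAGE_SIZE = 15  # results per page, as observed in the answer above


def query_for_page(query, page_index, page_size=PAGE_SIZE):
    """Return `query` with its No= offset set for a zero-based page index."""
    offset = page_index * page_size
    # Match No= only at the start of the string or right after '&', so that
    # other parameters such as Ns= or N= are never touched.
    new_query, replaced = re.subn(
        r"(^|&)No=\d+", rf"\g<1>No={offset}", query, count=1
    )
    if replaced == 0:
        raise ValueError("query has no No= parameter")
    return new_query


sample = "Ns=pPopularityScore%7c1&No=30&props=15292&dims=530&As=&N=0+3+10500915"
print(query_for_page(sample, 1))
```

Rewriting the string with a regular expression, rather than parsing and re-encoding it, keeps the `%7c` and `+` sequences exactly as the site wrote them.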

I tried the following code and was able to load the next page. I hope it helps you too:

from selenium import webdriver
import os

# Raw string, so the backslashes in the Windows path are not read as escapes
chromedriver = r"C:\Users\pappuj\Downloads\chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
url = 'http://www.zoover.nl/cyprus'
driver.get(url)
# Find the "next" link by its class name and click it
driver.find_element_by_class_name('next').click()

Thanks
