java - How do I stop scraping data at the last page? - Stack Overflow

       int pageCount = 0;
       while (true) {
           System.out.println("Scraping page: " + ++pageCount);
           scrapeCurrentPage();
           
           List<WebElement> nextButtons = driver.findElements(By.xpath("//a[@class='page-link' and contains(@href, 'page=')]"));
           
           // Ensure next button is found and is enabled
           if (nextButtons.size() > 0) {
               WebElement nextButton = nextButtons.get(0);
               
               // Check if the "Next" button is disabled
               if (nextButton.isEnabled()) {
                   nextButton.click();
                   wait.until(ExpectedConditions.presenceOfElementLocated(By.xpath("//*[@id='entry_212408']//div[@class='row']")));
                   
                   try {
                       Thread.sleep(2000);  // Wait for the page to load
                   } catch (InterruptedException e) {
                       e.printStackTrace();
                   }
               } else {
                   // Exit when "Next" button is disabled (end of pages)
                   System.out.println("Last page reached. No more pages to scrape.");
                   break;
               }
           } else {
               // If no "Next" button is found, we assume we've reached the last page
               System.out.println("No next button found. Assuming last page.");
               break;
           }
       }

I tried checking whether the "Next" button is disabled on the last page, but the loop keeps going back to the first page even when there are no pages left. The site I am trying to scrape has 5 pages in total, and on the last (5th) page the "Next" button disappears.
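
One likely cause, judging only from the locator in the loop: the XPath `//a[@class='page-link' and contains(@href, 'page=')]` matches every pagination link, not just "Next", so on the last page `nextButtons.get(0)` may well be the link back to page 1. Also, according to the Selenium Javadoc, `isEnabled()` generally returns true for everything except disabled input elements, so it is unlikely to signal the last page for an `<a>` element. Below is a minimal sketch of one alternative: pick the link whose page number is exactly the current page plus one, and stop when no such link exists. It assumes the hrefs carry the number as `page=<digits>`; the class `NextPageFinder`, the method `nextPageHref`, and the sample hrefs are hypothetical and not taken from the site.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NextPageFinder {
    // Assumption: pagination hrefs carry the page number as "page=<digits>".
    private static final Pattern PAGE_PARAM = Pattern.compile("\\bpage=(\\d+)");

    /**
     * Returns the first href that points at currentPage + 1,
     * or Optional.empty() if no link does (treated as "last page").
     */
    public static Optional<String> nextPageHref(List<String> hrefs, int currentPage) {
        for (String href : hrefs) {
            if (href == null) {
                continue; // a link without an href cannot be the next page
            }
            Matcher m = PAGE_PARAM.matcher(href);
            if (m.find() && Integer.parseInt(m.group(1)) == currentPage + 1) {
                return Optional.of(href);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical hrefs as they might look on the last of 5 pages:
        // only links back to earlier pages remain.
        List<String> onLastPage = List.of("/list?page=1", "/list?page=4");
        System.out.println(nextPageHref(onLastPage, 5).isPresent()); // prints false

        // Hypothetical hrefs on page 1.
        List<String> onFirstPage = List.of("/list?page=2", "/list?page=5");
        System.out.println(nextPageHref(onFirstPage, 1).orElse("none")); // prints /list?page=2
    }
}
```

In the Selenium loop, the hrefs could be collected by calling `getAttribute("href")` on each element returned by `driver.findElements(...)`, with `pageCount` passed as `currentPage`. If the result is present, navigate with `driver.get(href)` and wait for the rows as before; if it is empty, `break`. Comparing page numbers rather than relying on `get(0)` means the loop cannot wander back to page 1, whatever order the links appear in.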
