javascript - How to scrape html and JS without WebView in background when the application is not running in android - Stack Over

I want to scrape data from web pages and also run JavaScript, using a background mechanism such as a Service in an Android app. I can do this with WebView, but then it has to run on the main thread. I want to authenticate, get the cookies from the web portal, and generate cookies on the HTML content. I have tried Jsoup, which can only parse HTML content and cannot inject JavaScript into an HTML element. I also want to execute AJAX calls.

I know this might not be possible with any single library. However, is there an approach I can follow that combines different libraries for the different steps?

asked Mar 19 at 13:05 by user_8275, edited Mar 19 at 16:33 by Mister Jojo
  • Are you familiar with the term "headless browser"? – ADyson Commented Mar 19 at 15:28

1 Answer

Jsoup alone cannot execute JavaScript or handle AJAX; it only parses the HTML it is given. Instead, drive a headless browser on a remote server with a tool such as Selenium, or offload the task to a Node.js backend that uses Puppeteer or Playwright. For authentication and cookie handling, use OkHttp in combination with a web scraping service. To run the work in the background on Android, use WorkManager or a foreground service. Running a headless browser directly on Android is impractical, so a backend approach is often the best solution.
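
For the cookie part, here is a minimal sketch of keeping session cookies off the main thread with the standard `java.net.CookieManager`, with no WebView involved. The class name `SessionCookies`, its helper methods, the host `portal.example.com`, and the cookie values are all hypothetical and only illustrate the idea. The login response is simulated with hard-coded `Set-Cookie` values, so nothing here touches the network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper: an in-memory cookie jar usable from any background thread.
final class SessionCookies {

    /** Creates a jar backed by the default in-memory store that accepts all cookies. */
    static CookieManager newJar() {
        return new CookieManager(null, CookiePolicy.ACCEPT_ALL);
    }

    /** Stores the Set-Cookie values of a login response (simulated in this sketch). */
    static void remember(CookieManager jar, URI uri, List<String> setCookieValues) {
        try {
            jar.put(uri, Collections.singletonMap("Set-Cookie", setCookieValues));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds the Cookie header value for a follow-up request, or "" if no cookie applies. */
    static String cookieHeaderFor(CookieManager jar, URI uri) {
        try {
            Map<String, List<String>> headers =
                    jar.get(uri, Collections.<String, List<String>>emptyMap());
            List<String> values = headers.get("Cookie");
            return values == null ? "" : String.join("; ", values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CookieManager jar = newJar();
        // Assumed values: what a login endpoint might send back.
        remember(jar, URI.create("https://portal.example.com/login"),
                Arrays.asList("SESSIONID=abc123; Path=/; HttpOnly", "csrftoken=xyz; Path=/"));
        // The same jar now supplies the cookies for later requests to that host.
        System.out.println(cookieHeaderFor(jar, URI.create("https://portal.example.com/data")));
    }
}
```

In a real app the `Set-Cookie` values would come from the actual login response, and the resulting header would be attached to each later request. OkHttp exposes a `CookieJar` interface for plugging in this kind of storage. This only covers cookies; pages that build their content with JavaScript still need the headless browser or backend approach described above.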
