
javascript - How do I get the data of a website as shown in INSPECT ELEMENT and not in VIEW PAGE SOURCE? - Stack Overflow


I want to get the data that a website shows in INSPECT ELEMENT. Take Truecaller, for example: I want to get the name of the person whose mobile number I searched for. But whenever I write a Python script, it gives me the PAGE SOURCE, which does not contain the required information.

Kindly help me. I am a beginner, so please excuse any mistakes in the question.


Asked Mar 13, 2017 at 6:46 by Manish Devgan
  • Try this: print(requests.get('http://stackoverflow.com/questions/42757866/how-do-i-get-the-data-of-a-website-as-shown-in-inspect-element-and-not-in-view-p').text) – TheChetan Commented Mar 13, 2017 at 7:05
  • Try this post: [How to get data from inspect element of a webpage using Python](stackoverflow.com/questions/25027339/…) – raviriley Commented Mar 13, 2017 at 7:40

3 Answers


TL;DR: Use Selenium (and PhantomJS)

View Page Source gives you the HTML that was delivered when the page was first requested, which is most likely what you get when you make the request from Python.

Nowadays a lot of pages load more data and modify the DOM after the initial HTML has loaded, so you will not find most of the information you want in that initial response. To get what Inspect Element shows, you need some sort of web browser to actually open the page, wait for the information you want to load, and then read it. You still want to do all of this from your Python script, however.

Enter Selenium, a tool for browser automation (mostly used for testing web pages). You can write a Python script that opens a browser page and does whatever you tell it to, including waiting a while and then searching for a DOM element that only appears after the page has loaded. Your script will still open a visible browser window, though, which is a bit awkward.

Enter PhantomJS, a headless browser that Selenium can drive, so you can do all of this without relying on an actual browser UI.

Selenium alone may be enough to achieve your goal, but with PhantomJS you can do it in an even cleaner way. Good luck!
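A minimal sketch of the approach described in this answer, under a few assumptions: PhantomJS is no longer maintained and Selenium 4 dropped its driver, so headless Chrome is swapped in here to play the same role. The function name, URL, and CSS selector are hypothetical placeholders, not anything taken from Truecaller.

```python
def fetch_rendered_text(url, css_selector, timeout=10):
    """Open url in headless Chrome and return the text of the first
    element matching css_selector once it exists in the DOM."""
    # Imported inside the function so this file still loads when
    # Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # no visible browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the element is present in the DOM, or raise
        # TimeoutException after `timeout` seconds.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return element.text
    finally:
        driver.quit()  # always close the browser, even on errors


# Example call (hypothetical URL and selector, replace with your own):
# print(fetch_rendered_text("https://example.com/search?q=123", "#result-name"))
```

The explicit wait is what matters here: reading the page immediately after `driver.get` can still return the pre-JavaScript state.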

INSPECT ELEMENT and VIEW PAGE SOURCE are not the same.

View source shows you the original HTML source of the page. When you view source from the browser, you get the HTML as it was delivered by the server, not as it looks after JavaScript has done its thing.

The inspector shows you the DOM as it was interpreted by the browser. This includes, for example, changes made by JavaScript, which cannot be seen in the HTML source.
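To make the difference concrete, here is a small sketch using only the standard library. The page content, the `name` id, and the value "Alice" are made up for illustration. The HTML stands in for what the server delivers: the `<div>` is empty, and only the script, which a plain HTTP client never runs, would fill it in.

```python
from html.parser import HTMLParser

# Hypothetical page as delivered by the server (what View Page Source shows).
PAGE_SOURCE = """
<html><body>
  <div id="name"></div>
  <script>document.getElementById("name").textContent = "Alice";</script>
</body></html>
"""


class TextById(HTMLParser):
    """Collect the text inside the element with a given id.
    No JavaScript is executed; this only reads the static markup."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # > 0 while the parser is inside the target element
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


def text_in_source(html, element_id):
    parser = TextById(element_id)
    parser.feed(html)
    return "".join(parser.text).strip()


print(repr(text_in_source(PAGE_SOURCE, "name")))  # prints ''
```

The name appears only inside the script, so a scraper reading the static HTML finds an empty element, while the inspector would show the filled-in one.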

What you see in the element inspector is no longer the source code. You see a version that JavaScript has manipulated.

Instead of trying to execute all the scripts on your own, which may lead to multiple problems such as cross-origin restrictions, search the Network tab for the actual search request and its parameters. Then request the data from there; that is the trick.

Also, it seems you need to be logged in to search on the URL you provided, so you will eventually need to supply the matching cookies, session, and headers, just as a request from your browser would.

So what I want to say is: if the data you are looking for is not in the source, analyse where it is coming from.
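As a rough sketch of that idea, assuming the Network tab reveals a JSON endpoint: the endpoint URL, the `q` parameter, and the cookie value below are hypothetical placeholders, so copy the real ones from the request your browser makes. The standard library's `urllib` is used here so the example has no dependencies; `requests` would work just as well.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(endpoint, params, headers=None, cookie=None):
    """Build (but do not send) a GET request that mimics the call
    seen in the browser's Network tab."""
    url = endpoint + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers=headers or {})
    if cookie:
        # Session cookie copied from the logged-in browser session.
        request.add_header("Cookie", cookie)
    return request


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_search_request(
        "https://api.example.com/search",  # hypothetical endpoint
        {"q": "5551234567"},               # hypothetical parameter name
        headers={"User-Agent": "Mozilla/5.0", "Accept": "application/json"},
        cookie="session=PASTE_VALUE_FROM_BROWSER",
    )
    print(req.full_url)
    # data = fetch_json(req)  # uncomment once the real endpoint is filled in
```

Whether this works depends on the site: some endpoints need extra tokens or reject non-browser clients, and the site's terms of use may restrict automated access.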
