php - How to get value from external website while the value is set by javascript in that site - Stack Overflow

I load external website content (HTML/JSON) with jQuery. I also have a PHP proxy page for sites with a strict same-origin policy.

My problem is with one particular site. Its HTML source contains: <span id='target'>0.00</span>

When I browse this external page in Chrome, the SPAN value is set to X.

I tried jQuery $.get/$.ajax directly and through the PHP proxy; the value returned is always 0.

Is there any way to get the final value X? I understand it may be close to impossible, since it is hard to emulate the browser and run the remote JavaScript.

I can test anything on my server, so please share any feasible method. Thanks!

asked Dec 31, 2013 at 16:54 by wing_hk
  • In your example, does the value get set to X by e.g. an onload script? As you know, JS happens "in the browser"; so you need to "let the browser do its thing", then look at the "final" contents. That is not the HTML as it is served to you - but as it is processed. I'm just thinking out loud about your problem here… did I understand correctly? – Floris Commented Dec 31, 2013 at 16:58
  • There is an earlier answer here that appears to ask the same question; but on closer inspection is different. Maybe the links there are some help. Or maybe using a webBrowserControl is the solution... – Floris Commented Dec 31, 2013 at 17:01
  • Do you intend to use this external website content in another web page? Could you make a very small example (maybe with two pages on your own site, one referring to the other) that show how you are currently trying to assimilate the data - with the expected and actual results? – Floris Commented Dec 31, 2013 at 17:05

2 Answers

I can think of two options. One is good, sensible, fast, etc. The other is stupid and a really bad idea, but was fun toying with. Your two options are:

  • Use PhantomJS
  • Fetch the source code of the external site via your PHP script, load it into your page and use jQuery to find the value.

The first option is the correct and sensible one. PhantomJS boots up a headless WebKit browser, loads the page, runs its scripts, and makes the resulting page available to you. There's a PHP wrapper too, so you can do it from PHP quickly.
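A minimal PhantomJS sketch of that approach, assuming the page lives at a placeholder URL and the element carries the id `target` from the question. The one-second wait is a guess at how long the page's scripts need, not a documented value.

```javascript
// read-target.js: run with `phantomjs read-target.js`.
// targetUrl is a placeholder; the id "target" comes from the question.
var targetUrl = 'http://example.com/page.html';

// Executed inside the loaded page, after its own scripts have run.
function readTarget() {
  var el = document.getElementById('target');
  return el ? el.textContent : null;
}

// The `phantom` global exists only under PhantomJS, so this part is
// skipped if the file is loaded anywhere else.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  page.open(targetUrl, function (status) {
    if (status !== 'success') {
      console.log('Failed to load ' + targetUrl);
      phantom.exit(1);
      return;
    }
    // Assumed delay: give the page's scripts a moment to update the span.
    setTimeout(function () {
      console.log(page.evaluate(readTarget));
      phantom.exit(0);
    }, 1000);
  });
}
```

The function passed to `page.evaluate` runs in the context of the loaded page, which is why it can see the value the remote JavaScript wrote into the span.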

Alternatively, you could do something like this:

Using jQuery directly, or jQuery routed through a PHP script, you can fetch the source code of the live website, embed it in the current page and then extract the value using JavaScript. If the external site doesn't send the appropriate Access-Control-Allow-Origin headers, you won't be able to do it via JavaScript alone, so you'll have to route the request through a PHP script on your own domain.

I've done a quick JSFiddle that will grab the JSFiddle Google Verification <meta> tag from the page, using the technique I described above. It's here: http://jsfiddle.net/USPVJ/1/.
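A rough sketch of that technique, not the fiddle's actual code. It assumes a hypothetical same-origin `proxy.php` that echoes the remote page's HTML, and an empty, hidden `<div id="sandbox">` on the current page; `fetchAndRead` is a made-up name. jQuery is passed in as a parameter only so the function is easy to try outside a browser.

```javascript
// Assumptions (not from the original answer): `proxy.php` is a hypothetical
// same-origin script that returns the remote page's HTML, and the current
// page contains an empty, hidden <div id="sandbox">.
function fetchAndRead($, proxyUrl, selector, done) {
  $.get(proxyUrl, function (html) {
    // Embed the fetched markup in the current page...
    var sandbox = $('#sandbox').html(html);
    // ...then query it like any other part of the DOM.
    done(sandbox.find(selector).text());
  });
}

// In a browser with jQuery loaded:
// fetchAndRead(jQuery, 'proxy.php', '#target', function (value) {
//   console.log(value);
// });
```

Whether this returns the final value depends on how the remote page sets it. As far as I can tell, markup injected this way does not trigger the remote page's body onload handler, so a value set there would still read as its initial state.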

I have to strongly discourage doing this. By doing it you're bypassing the same-origin restrictions, which exist for a very good reason, and injecting foreign code into your website. Make sure you know the content well before you do any crazy stuff like this.

It appears that the following simple pair of files work together "nicely". Let me know if this is not what you were trying to do…

File 1: http://www.floris.us/SO/getFrom.html

<html>
<head>
<script type="text/javascript">
  function changeItem() {
    document.getElementById("one").innerHTML = "1";
    }
</script>
</head>
<body onload='changeItem();'>
<div id="one">0</div>
</body>
</html>

It starts out with the number 0 in the body of the HTML, then onload changes it to a 1.

File 2: http://www.floris.us/SO/insertHere.php

<html>
<body>
This is the HTML from the other source:<br><br>
<?php
  $text = file_get_contents("http://www.floris.us/SO/getFrom.html");
  echo $text;
  ?>  
<br>Did you see a 0 or a 1?<br>
</body>
</html>

When I load this second script, I do indeed see

This is the HTML from the other source:

1

Did you see a 0 or a 1?

It seems, then, that the JavaScript ran OK. The actual page source for the final page (as loaded from insertHere.php):

<html>
<body>
This is the HTML from the other source:<br><br>
<html>
<head>
<script type="text/javascript">
  function changeItem() {
    document.getElementById("one").innerHTML = "1";
    }
</script>
</head>
<body onload='changeItem();'>
<div id="one">0</div>
</body>
</html>  
<br>Did you see a 0 or a 1?<br>
</body>
</html>

There appear to be two sets of <html> tags, which is ugly…

Update: when I try to extract the value from the div by changing the second file to

<html>
<head>
<script type="text/javascript">
  function whatIsIt() {
    document.getElementById("here").innerHTML = document.getElementById("one").innerHTML;
  }
</script>
</head>
<body onload="whatIsIt();">
This is the HTML from the other source:<br><br>
<?php
  $text = file_get_contents("http://www.floris.us/SO/getFrom.html");
  echo $text;
  ?>  
 <br>Did you see a 0 or a 1?<br>
 I extracted a value of <div id="here"></div>

   </body>
 </html>

I get a value of 0… because the onload functions are running in the wrong order. Perhaps this is the issue you are running into?
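Another explanation worth checking: under the HTML parsing rules, when a second <body> start tag appears, its attributes are copied onto the existing body element only if that element does not already have them. The sketch below is a simplified model of that rule applied to the two versions of insertHere.php; `mergeBodyAttributes` is a made-up helper for illustration, not a browser API.

```javascript
// Simplified model of how an HTML parser treats a stray second <body> tag:
// attributes are added to the existing body only when not already present.
// mergeBodyAttributes is a hypothetical helper, not a browser API.
function mergeBodyAttributes(existing, incoming) {
  var merged = {};
  Object.keys(existing).forEach(function (name) {
    merged[name] = existing[name];
  });
  Object.keys(incoming).forEach(function (name) {
    if (!Object.prototype.hasOwnProperty.call(merged, name)) {
      merged[name] = incoming[name];
    }
  });
  return merged;
}

// First version: the outer <body> has no onload, so the inner one is adopted.
var first = mergeBodyAttributes({}, { onload: 'changeItem();' });

// Updated version: the outer <body> already has an onload, so it is kept.
var second = mergeBodyAttributes(
  { onload: 'whatIsIt();' },
  { onload: 'changeItem();' }
);

console.log(first.onload);  // changeItem();
console.log(second.onload); // whatIsIt();
```

If that is what is happening, the inner page's changeItem() never runs in the updated file, so the div still holds 0 when whatIsIt() copies it.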

Please don't hesitate to leave a comment if I misunderstood your intent.
