I am trying to get my head around how to fetch Google search results with PHP or JavaScript. I know this used to be possible, but now I can't find a way.
I am trying to duplicate (somewhat) the functionality of http://www.getupdated.se/sokmotoroptimering/seo-verktyg/kolla-ranking/
But the core issue I want to solve is just getting the search results via PHP or JavaScript; the rest I can figure out.
Fetching the results using file_get_contents() or cURL doesn't seem to work.
Example:
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, 'http://www.google.se/#hl=sv&q=dogs');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$result = curl_exec($ch);
curl_close($ch);
echo '<pre>';
var_dump($result);
echo '</pre>';
Results:
string(219) "302 Moved The document has moved here."
So, with some Googling I found http://code.google.com/apis/customsearch/v1/overview.html, but that seems to work only for generating a custom search for one or more websites. It seems to require a "Custom Search Engine" cx parameter to be passed.
So, any ideas?
asked Jan 27, 2013 at 20:43 by lejahmie, edited Jan 27, 2013 at 20:52 by Mirko Adari
- You will need the CURLOPT_FOLLOWLOCATION flag set to true. – hank Commented Jan 27, 2013 at 20:47
- @zerkms Any other suggestions? Are there any legal ways? I really want to do this in a legal way, but I haven't found one. That was part of my question. – lejahmie Commented Jan 28, 2013 at 11:17
- @jamietelin: the first result in Google for the request "google search API" is developers.google.com/web-search/docs. Don't be so lazy. – zerkms Commented Jan 28, 2013 at 20:02
- @zerkms I am not lazy, I have read that from start to finish, plus the Custom Search API docs. But you haven't, obviously :)
Note: The Google Web Search API has been officially deprecated as of November 1, 2010. It will continue to work as per our deprecation policy, but the number of requests you may make per day will be limited. Therefore, we encourage you to move to the new Custom Search API.
– lejahmie Commented Feb 2, 2013 at 22:23
- @zerkms Where are you trying to get with this? Are you winning? – lejahmie Commented Feb 4, 2013 at 20:01
3 Answers
I have done this before. Fetch the HTML of the results page with an HTTP request to https://www.google.co.in/search?hl=en&output=search&q=india, then parse the specific tags you need with the PHP Simple HTML DOM library.
Demo: the code below collects the title of every result:
<?php
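// Requires the PHP Simple HTML DOM library (simple_html_dom.php) next to this script.
include("simple_html_dom.php");
$html = file_get_html('http://www.google.co.in/search?hl=en&output=search&q=india');
// Initialise the list so print_r() works even when nothing matches.
$title = array();
// Each result is an <li class="g">; its title sits in an <h3 class="r">.
foreach ($html->find('li[class=g]') as $element) {
    foreach ($element->find('h3[class=r]') as $h3) {
        $title[] = '<h1>' . $h3->plaintext . '</h1>';
    }
}
print_r($title);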
?>
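If you would rather not depend on an extra library, the same extraction can be sketched with PHP's built-in DOMDocument and DOMXPath classes. This is only a sketch: the function name extractResultTitles and the sample markup are made up for illustration, and it assumes the results page still uses the li.g / h3.r structure from the demo above, which may not hold for the markup Google serves today.

```php
<?php
// Hypothetical helper (not part of any library): returns the text of every
// <h3 class="r"> found inside an <li class="g"> in the given HTML string.
function extractResultTitles(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely well-formed; collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $titles = [];
    // Exact class match, mirroring the li[class=g] / h3[class=r] selectors.
    foreach ($xpath->query('//li[@class="g"]//h3[@class="r"]') as $h3) {
        $titles[] = trim($h3->textContent);
    }
    return $titles;
}

// Made-up sample markup standing in for a downloaded results page.
$sample = '<ol>'
    . '<li class="g"><h3 class="r"><a href="#">First result</a></h3></li>'
    . '<li class="g"><h3 class="r"><a href="#">Second result</a></h3></li>'
    . '</ol>';

print_r(extractResultTitles($sample));
```

Keeping the parsing in a function that takes a string means you can test it against a saved copy of a page without sending a request each time.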
There is a PHP package on GitHub named google-url that does the job.
The API is very comfortable to use. See the example:
// this line creates a new crawler
$googleUrl=new \GoogleURL\GoogleUrl();
$googleUrl->setLang('en'); // say for which lang you want to search (it could have been "fr" instead)
$googleUrl->setNumberResults(10); // how many results you want to check
// launch the search for a specific keyword
$results = $googleUrl->search("google crawler");
// finally, you can loop over the results (an example is also available on the GitHub page)
However, you will need to add a delay between queries, or else Google will treat you as a bot and ask for a captcha, which will block the script.
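One way to add such a delay is a small wrapper that pauses for a random interval between calls. This is a rough sketch: searchWithDelay is a hypothetical helper, the search itself is passed in as a callable (in real use it would wrap $googleUrl->search()), and the delay values are placeholder assumptions, not thresholds known to avoid the captcha.

```php
<?php
// Hypothetical helper: calls $search once per keyword and sleeps for a random
// interval between consecutive calls. Delays are given in milliseconds.
function searchWithDelay(array $keywords, callable $search, int $minDelayMs = 2000, int $maxDelayMs = 5000): array
{
    $keywords = array_values($keywords);
    $last = count($keywords) - 1;
    $results = [];
    foreach ($keywords as $i => $keyword) {
        $results[$keyword] = $search($keyword);
        // No need to wait after the final query.
        if ($i < $last) {
            // usleep() takes microseconds, so convert from milliseconds.
            usleep(random_int($minDelayMs, $maxDelayMs) * 1000);
        }
    }
    return $results;
}

// Stand-in for the real search call, so the sketch runs without network access.
$fakeSearch = function (string $keyword): array {
    return ['result for ' . $keyword];
};

// Short delays here only to keep the demonstration quick.
$all = searchWithDelay(['google crawler', 'php scraper'], $fakeSearch, 100, 200);
print_r($all);
```

Randomising the pause rather than using a fixed interval is a design choice, not something the package requires.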
Odd, because if I run curl from the command line I get a 200 OK:
curl -I 'http://www.google.se/#hl=sv&q=dogs'
HTTP/1.1 200 OK
Date: Sun, 27 Jan 2013 20:45:02 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=b82cb66e9d996c48:FF=0:TM=1359319502:LM=1359319502:S=D-LW-_w8GlMfw-lX; expires=Tue, 27-Jan-2015 20:45:02 GMT; path=/; domain=.google.se
Set-Cookie: NID=67=XtW2l43TDBuOaOnhWkQ-AeRbpZOiA-UYEcs7BIgfGs41FkHlEegssgllBRmfhgQDwubG3JB0s5691OLHpNmLSNmJrKHKGZuwxCJYv1qnaBPtzitRECdLAIL0oQ0DSkrx; expires=Mon, 29-Jul-2013 20:45:02 GMT; path=/; domain=.google.se; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
Also, consider applying urlencode() to the URL you pass, so that this line:
curl_setopt($ch, CURLOPT_URL, 'http://www.google.se/#hl=sv&q=dogs');
changes to this:
curl_setopt($ch, CURLOPT_URL, 'http://www.google.se/' . urlencode('#hl=sv&q=dogs'));
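One thing to keep in mind is that everything after the # is a URL fragment, which HTTP clients do not send to the server, so the hl and q values never reach Google in either form. A possible alternative, sketched below, is to put the parameters in a real query string on the /search path (as in the google.co.in URL from the first answer) and let http_build_query() do the encoding. The helper names buildSearchUrl and fetchUrl are made up for this sketch, and fetchUrl also sets the CURLOPT_FOLLOWLOCATION flag suggested in the comments. Whether the response is usable HTML is not something this sketch can promise.

```php
<?php
// Hypothetical helper: builds a search URL with the parameters in the query
// string (after "?") instead of the fragment (after "#").
function buildSearchUrl(string $host, array $params): string
{
    // http_build_query() URL-encodes each key and value.
    return 'http://' . $host . '/search?' . http_build_query($params, '', '&');
}

// Hypothetical helper: fetches a URL with cURL and follows redirects.
// Needs the cURL extension; returns the body as a string, or false on failure.
function fetchUrl(string $url, int $timeout = 5)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow 302 responses
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    return curl_exec($ch);
}

$url = buildSearchUrl('www.google.se', ['hl' => 'sv', 'q' => 'dogs & cats']);
echo $url, PHP_EOL; // http://www.google.se/search?hl=sv&q=dogs+%26+cats

// Uncomment to send the request:
// var_dump(fetchUrl($url));
```

Encoding each value separately keeps the ? and & separators intact, which encoding the whole string at once would not.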