I run a secure link redirection service (expiringlinks.co). If I set the headers in PHP to redirect my visitors, Facebook is able to show a preview of the website I'm redirecting to when users send links to one another via Facebook. I want to avoid this. Right now I'm using an AJAX call to get the URL and JavaScript to redirect, but that causes problems for users who don't use JavaScript.
Here are a number of ways I've considered to block Facebook, none of which I can get working:
I've tried blocking the Facebook bot (facebookexternalhit/1.0 and facebookexternalhit/1.1), but it's not working. I don't think they're using those user agents for this functionality.
I'm thinking of blocking the Facebook IP addresses, but I can't find all of them, and I don't think it'll work unless I get all of them.
I've thought of using a CAPTCHA or even a button, but I can't bring myself to do that to my visitors. Not to mention I don't think anyone would use the site.
I've searched the Facebook docs for meta tags that would "opt me out", but haven't found one, and I doubt I would trust it if I had.
Any creative ideas or any idea how to implement the ones above? Thank you so much in advance!
asked Nov 19, 2011 at 17:00 by Joseph Szymborski
- How did you learn about (facebookexternalhit/1.0 and facebookexternalhit/1.1)? Was it through their docs, or have you dumped visitor user agents? Personally I'd try setting up a log of all users' user agents, then creating a link and getting Facebook to create a preview for it. If you find one that could be Facebook's, block it and see what happens. Facebook also uses several URLs which act as proxies for external content, such as http://external.ak.fbcdn/safe_image.php – user873578 Commented Nov 19, 2011 at 17:38
- I read about the bots online, from their docs and other sources. I've been using Piwik for analytics, and can't detect Facebook when I share links. I'm not sure I understand what you mean by the URLs as proxies. – Joseph Szymborski Commented Nov 19, 2011 at 17:54
- They use scripts from domains other than their "facebook." domain to load your content. They also cache the content and if the same content is requested again (like the image), Facebook will load their cached version instead of your version. That may also be in play here if you're trying to link to the same URL more than once. – user873578 Commented Nov 19, 2011 at 19:08
5 Answers
Try this; it works for me:
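<?php
// Send Facebook's preview crawler to a placeholder page instead of the real target.
// The User-Agent header may be missing, so fall back to an empty string.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (preg_match('/facebookexternalhit/i', $ua)) {
    header('Location: no_fb_page.php');
    exit;
}
?>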
You could go through your web server's log file and look for unusual user agents (perhaps ones containing "facebook"). Alternatively, take the logs and delete every entry containing Internet Explorer, Firefox, Opera and so on. What remains should be only bot user agents, and you can search those for the Facebook one.
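As a rough sketch of that filtering idea: the helper below keeps only user agents that do not contain a common browser marker and counts how often each one appears. The function name `non_browser_agents`, the marker list, and the assumption that the log uses the combined format (user agent as the last quoted field) are all mine, not something from the question, so adjust them to your setup.

```php
<?php
// Hypothetical helper: count user agents in access-log lines that do not
// look like a mainstream browser. Assumes the combined log format, where
// the user agent is the last double-quoted field on each line.
function non_browser_agents(array $lines): array
{
    // Substrings treated as "ordinary browser" markers (an assumption;
    // extend this list to match your own traffic).
    $browserMarkers = ['Firefox', 'Chrome', 'Safari', 'MSIE', 'Opera'];
    $counts = [];
    foreach ($lines as $line) {
        // Capture the last quoted field on the line.
        if (!preg_match('/"([^"]*)"\s*$/', $line, $m)) {
            continue; // line does not match the assumed format
        }
        $ua = $m[1];
        foreach ($browserMarkers as $marker) {
            if (stripos($ua, $marker) !== false) {
                continue 2; // looks like a browser, skip this log line
            }
        }
        $counts[$ua] = ($counts[$ua] ?? 0) + 1;
    }
    arsort($counts); // most frequent user agents first
    return $counts;
}
```

You would feed it something like `file($path, FILE_IGNORE_NEW_LINES)`, where `$path` points at your own access log, then share a test link on Facebook and see which of the remaining agents shows up.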
You could try using a meta refresh instead of a JavaScript redirect. It works in all browsers, including ones with JavaScript disabled, and because the page still returns a 200 response, any crawler should stop resolving there.
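A minimal sketch of what such a page could look like in PHP. The function name `render_refresh_page` and the fallback "Continue" link are my own additions for illustration. Whether a particular crawler follows meta refresh is not something this can guarantee, so test it against a real Facebook share.

```php
<?php
// Hypothetical helper: build an HTML page that redirects via meta refresh.
// The server answers with a normal 200 response; the target URL appears
// only in the page body, not in a Location header.
function render_refresh_page(string $targetUrl, int $delaySeconds = 0): string
{
    // Escape the URL so it cannot break out of the attribute value.
    $safeUrl = htmlspecialchars($targetUrl, ENT_QUOTES, 'UTF-8');
    return '<!DOCTYPE html>' . "\n"
        . '<html><head>' . "\n"
        . '<meta charset="utf-8">' . "\n"
        . '<meta http-equiv="refresh" content="' . $delaySeconds . ';url=' . $safeUrl . '">' . "\n"
        . '<title>Redirecting</title>' . "\n"
        . '</head><body>' . "\n"
        // Fallback link in case the browser ignores the refresh.
        . '<p><a href="' . $safeUrl . '">Continue</a></p>' . "\n"
        . '</body></html>' . "\n";
}
```

In the redirect script you would then `echo render_refresh_page($url);` instead of calling `header('Location: ...')`, with `$url` being whatever your service looked up for the link.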
It can be done in nginx using the geoip2 module.
# This block goes in the http { } part of the config, for example
# /etc/nginx/conf.d/geoip.conf
geoip2 /usr/share/GeoIP/country_asn.mmdb {
    # If you have a database update script, you can configure auto reload:
    # auto_reload 1h;
    $geoip2_asn asn;
    $geoip2_as_name as_name;
    $geoip2_continent continent;
    $geoip2_continent_name continent_name;
    $geoip2_country country;
}
And use it in a location block:
# Put this inside a location block.
# AS32934 is Facebook's autonomous system number.
if ($geoip2_asn = "AS32934") {
    return 402;
}
All you need to do is appropriately set up robots.txt.
http://www.robotstxt/robotstxt.html