I'm trying to detect either of the following two cases:
- A specific list of bots (FacebookExternalHit|LinkedInBot|TwitterBot|Baiduspider)
- Any bots that don't support the Crawlable Ajax Specification
I've seen similar questions (How to recognize Facebook User-Agent), but nothing that explains how to do this in Node and Express.
I need to do this in a format like this:
app.get("*", function(req, res){
  if (is one of the bots) //serve snapshot
  if (is not one of the bots) res.sendFile(__dirname + "/public/index.html");
});
Asked Mar 25, 2015 at 16:51 by CaribouCode (edited May 23, 2017 at 12:25 by Community Bot)
3 Answers
You can check the User-Agent header in the request object and test its value against different bots.
As of now, Facebook says they have three types of User-Agent header values (see The Facebook Crawler), and Twitter's User-Agent includes a version (see Twitter URL Crawling & Caching). The example below should cover both bots.
Node
var http = require('http');
var server = http.createServer(function(req, res){
  // The header can be missing, so fall back to an empty string
  var userAgent = req.headers['user-agent'] || '';
  if (userAgent.startsWith('facebookexternalhit/1.1') ||
      userAgent === 'Facebot' ||
      userAgent.startsWith('Twitterbot')) {
    /* Do something for the bot */
  }
});
server.listen(8080);
Express
var express = require('express');
var app = express();
app.get('/', function(req, res){
  // The header can be missing, so fall back to an empty string
  var userAgent = req.headers['user-agent'] || '';
  if (userAgent.startsWith('facebookexternalhit/1.1') ||
      userAgent === 'Facebot' ||
      userAgent.startsWith('Twitterbot')) {
    /* Do something for the bot */
  }
});
app.listen(8080);
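The same check can be generalized to the list of bots from the question. What follows is only a minimal sketch under stated assumptions: the names BOT_PATTERN, isListedBot and makeHandler are hypothetical, the snapshot response is a placeholder, and it assumes that a case-insensitive substring match on the User-Agent header is good enough for these bots.

```javascript
// Hypothetical helper names; the bot list is taken from the question.
var BOT_PATTERN = /facebookexternalhit|linkedinbot|twitterbot|baiduspider/i;

// Returns true when the User-Agent string contains one of the listed bot names.
function isListedBot(userAgent) {
  // A missing header is treated as "not a bot"
  return typeof userAgent === 'string' && BOT_PATTERN.test(userAgent);
}

// Builds an Express-style handler: listed bots get a snapshot,
// everyone else gets the regular index file.
function makeHandler(indexPath) {
  return function (req, res) {
    if (isListedBot(req.headers['user-agent'])) {
      res.send('snapshot placeholder'); // assumption: replace with the real snapshot
    } else {
      res.sendFile(indexPath);
    }
  };
}

// Usage with Express (assumed to be installed):
//   app.get('*', makeHandler(__dirname + '/public/index.html'));
```

Keeping the matching in a plain function means it can be checked without starting a server, and the list can be extended by editing one regular expression.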
You can use the request.headers object to check whether the incoming request contains any User-Agent information specific to that bot. A simple example:
Node
var http = require('http');
var server = http.createServer(function(req, res){
  if (req.headers['user-agent'] === 'facebookexternalhit/1.1') {
    /* do something for the Facebook bot */
  }
});
server.listen(8080);
Express
var express = require('express');
var app = express();
app.get('/', function(req, res){
  if (req.headers['user-agent'] === 'facebookexternalhit/1.1') {
    /* do something for the Facebook bot */
  }
});
app.listen(8080);
This Node/Express middleware analyzes a range of user agent strings and gives you a simple "bot==true" or "desktop==true" style check. I haven't used it, and the readme sounds like it was just a trial project, so I don't know how well maintained it will be going forward, but it should detect all sorts of bots.
https://github.com/rguerreiro/express-device
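If adding a dependency is not desirable, the general idea of flagging bots in middleware can be sketched by hand. This is only an illustration and does not reproduce the express-device API: the name markBots, the req.isBot property, and the pattern list are all hypothetical.

```javascript
// Hypothetical middleware factory; this is NOT the express-device API.
function markBots(patterns) {
  // Lowercase the patterns once so matching is case-insensitive
  var needles = patterns.map(function (p) { return p.toLowerCase(); });
  return function (req, res, next) {
    // The header can be missing, so fall back to an empty string
    var ua = (req.headers['user-agent'] || '').toLowerCase();
    // req.isBot is a made-up property name used only in this sketch
    req.isBot = needles.some(function (n) { return ua.indexOf(n) !== -1; });
    next();
  };
}

// Usage with Express (assumed to be installed):
//   app.use(markBots(['FacebookExternalHit', 'LinkedInBot', 'TwitterBot', 'Baiduspider']));
//   app.get('*', function (req, res) {
//     if (req.isBot) { /* serve snapshot */ }
//     else { res.sendFile(__dirname + '/public/index.html'); }
//   });
```

Setting a flag once in middleware keeps the route handlers free of User-Agent parsing, at the cost of maintaining the pattern list yourself.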