javascript - Use File Content to Determine MIME Type with Node JS

It seems all of the popular MIME type libraries for node.js just use the file name extension rather than peeking into the file to determine the MIME type.

Is there a good way to use Node to jump into the file and intelligently determine the file's MIME type in case an extension is not present?

asked Jul 9, 2014 at 20:10 by Kirk Ouimet

2 Answers

It does feel like a pity that most of the popular MIME modules simply map the file extension to a type.

After searching deeper, I found a module called mmmagic, which seems to do exactly what you want.

Be aware that, in my experience, MIME detection is in principle not completely reliable, and there is a small chance of false detections.

Example of usage (taken from their site):

  var mmm = require('mmmagic'),
      Magic = mmm.Magic;

  var magic = new Magic(mmm.MAGIC_MIME_TYPE);
  magic.detectFile('node_modules/mmmagic/build/Release/magic.node', function(err, result) {
      if (err) throw err;
      console.log(result);
      // output on Windows with 32-bit node:
      //    application/x-dosexec
  });

Since MIME itself dictates nothing about the format of a file's contents, you can only use heuristics to guess what a file contains:

Some binary formats start with a so-called magic number, but those can be wrong or ambiguous. Wikipedia's coverage of magic numbers has more info.
Many text file formats contain grammar constructs that you can use for a simple pattern-matching test, for example XML, CSV, or JSON. However, some formats (e.g. HTML) have a rather "evolved" syntax definition, which makes them ambiguous and thus hard to pattern match.
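
The magic-number heuristic can be sketched in plain Node without any native module. This is only an illustration, not how mmmagic works internally: the names `SIGNATURES`, `sniffBuffer`, and `sniffFile` are made up for this example, and the table covers just a handful of well-known leading byte sequences.

```javascript
const fs = require('fs');

// Illustrative, far-from-complete table of leading byte signatures.
const SIGNATURES = [
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] },            // "GIF8"
  { mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { mime: 'application/zip', bytes: [0x50, 0x4b, 0x03, 0x04] },       // "PK\x03\x04"
];

// Return the MIME type of the first signature matching the start of buf, or null.
function sniffBuffer(buf) {
  for (const sig of SIGNATURES) {
    if (buf.length >= sig.bytes.length && sig.bytes.every((b, i) => buf[i] === b)) {
      return sig.mime;
    }
  }
  return null;
}

// Read only the first few bytes of a file and sniff them.
function sniffFile(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const buf = Buffer.alloc(16);
    const bytesRead = fs.readSync(fd, buf, 0, buf.length, 0);
    return sniffBuffer(buf.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

The ZIP entry shows the ambiguity mentioned above: other container formats built on ZIP (such as .docx or .jar) begin with the same bytes, so this check alone would report all of them as `application/zip`.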

To better illustrate the issue of ambiguity, here is an example: browsers have developed a very high tolerance and accept anything that remotely resembles HTML, so an HTML (or even XHTML) file is hard to identify. On top of that, what is nominally an HTML file could actually be written in a non-HTML template language (such as Jade, Handlebars, or Angular templates). This is just one of many examples where things get very ambiguous.
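
A rough sketch of pattern matching for text formats makes the same point. The function name `guessTextType` and its rules are assumptions made up for this illustration; they are deliberately naive and no real library is claimed to work this way.

```javascript
// Naive text sniffing: try a few cheap pattern tests in order.
function guessTextType(text) {
  const trimmed = text.trim();
  if (trimmed === '') return null;

  // JSON: only consider objects and arrays, and let the parser decide.
  if (trimmed[0] === '{' || trimmed[0] === '[') {
    try {
      JSON.parse(trimmed);
      return 'application/json';
    } catch (e) {
      // Not valid JSON; fall through to the other tests.
    }
  }

  // XML: look for an XML declaration at the very start.
  if (/^<\?xml\s/i.test(trimmed)) return 'application/xml';

  // HTML: a doctype or an <html> tag anywhere in the text.
  if (/^<!doctype\s+html/i.test(trimmed) || /<html[\s>]/i.test(trimmed)) {
    return 'text/html';
  }

  return 'text/plain';
}

// A Handlebars-style template is indistinguishable from HTML for this check.
console.log(guessTextType('<html><body>{{title}}</body></html>'));
```

An XHTML document that begins with an XML declaration would be reported as `application/xml` here, and a template full of `{{...}}` placeholders as `text/html`. Both answers are defensible and both can be wrong, which is the ambiguity described above.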
