javascript - XMLHttpRequest returns wrongly encoded characters - Stack Overflow

I use XMLHttpRequest to read the PDF document http://www.virtualmechanics./support/tutorials-spinner/Simple2.pdf

%PDF-1.3
%âãÏÓ
[...]

and print its content out to console:

var xhr = new XMLHttpRequest();
xhr.onreadystatechange = function() {
    if (xhr.readyState === 4 && xhr.status === 200) {
      console.log(xhr.responseText);
      console.log('âãÏÓ');
    }
};
xhr.open('GET', 'http://www.virtualmechanics./support/tutorials-spinner/Simple2.pdf', true);
xhr.send();

However, the console says

%PDF-1.3
%����
[...]
âãÏÓ

(The last line comes from the reference console.log call above, which verifies that the console can actually display those characters.) Apparently, the characters are wrongly decoded at some point. What is going wrong, and how can I fix it?

Asked Apr 28, 2015 at 10:35 by Nico Schlömer; edited Apr 28, 2015 at 11:11.
  • Probably your console font simply does not have glyphs for âãÏÓ... – mkl Commented Apr 28, 2015 at 10:43
  • @mkl Yes, it has. I edited the question accordingly. – Nico Schlömer Commented Apr 28, 2015 at 10:46
  • Arg, I did not see it immediately: you use XMLHttpRequest.responseText. This property already tries to interpret the response as text and seems to fail. PDF files are not text files and, therefore, should not be treated as such. You may want to try working with XMLHttpRequest.response instead; also cf. the MDN "Sending and Receiving Binary Data" page. – mkl Commented Apr 28, 2015 at 11:32

2 Answers

XMLHttpRequest's default response type is text, but here one is actually dealing with binary data. Eric Bidelman describes how to work with it.
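The replacement characters can be reproduced without any network request. The sketch below assumes that the browser fell back to UTF-8 when decoding responseText (the usual fallback when the response declares no charset); the byte values are the Latin-1 codes of the four characters from the question.

```javascript
// The four bytes that follow '%' on the second line of the PDF header
// (the Latin-1 codes of â, ã, Ï, Ó).
const bytes = new Uint8Array([0xe2, 0xe3, 0xcf, 0xd3]);

// Decoded as UTF-8 (assumed to be what responseText did here): each byte
// looks like the start of a multi-byte sequence that is never completed,
// so each one is replaced by U+FFFD.
const asUtf8 = new TextDecoder('utf-8').decode(bytes);

// Mapping every byte to the code point with the same value (ISO-8859-1)
// gives back the characters shown in the question.
const asLatin1 = String.fromCharCode(...bytes);

console.log(asUtf8);
console.log(asLatin1);
```

Once the bytes have been replaced by U+FFFD, the original values cannot be recovered from the string, which is why the data has to be requested in binary form in the first place.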

The solution to the problem is to read the data as a Blob, then to extract the data from the blob and plug it into hash.update(..., 'binary'):

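var xhr = new XMLHttpRequest();
xhr.open('GET', details.url, true);
// Ask for the raw bytes as a Blob instead of decoded text.
xhr.responseType = 'blob';
xhr.onload = function() {
  if (this.status === 200) {
    var reader = new FileReader();
    // Register the handler before starting the read.
    reader.onloadend = function() {
      // Each character of reader.result holds one byte (0-255).
      var hash = crypto.createHash('sha1');
      hash.update(reader.result, 'binary');
      console.log(hash.digest('hex'));
    };
    reader.readAsBinaryString(this.response);
  }
};
xhr.send(null);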
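A variation on the same idea, as a rough sketch: requesting an ArrayBuffer instead of a Blob avoids the FileReader step. This assumes an environment that offers both XMLHttpRequest and Node's crypto module, as the snippet above seems to (for example NW.js or Electron). The helper names sha1OfArrayBuffer and sha1OfUrl are made up for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash raw bytes without any text decoding.
function sha1OfArrayBuffer(arrayBuffer) {
  return crypto
    .createHash('sha1')
    .update(Buffer.from(arrayBuffer)) // view over the same bytes, no copy
    .digest('hex');
}

// Hypothetical helper: fetch `url` as raw bytes and pass the SHA-1 hex
// digest to `callback`. Only usable where XMLHttpRequest is defined.
function sha1OfUrl(url, callback) {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.responseType = 'arraybuffer';
  xhr.onload = function () {
    if (xhr.status === 200) {
      callback(sha1OfArrayBuffer(xhr.response));
    }
  };
  xhr.send();
}
```

Splitting the hashing from the request keeps the byte handling easy to check on its own, for instance with the three bytes of "abc".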

The character encoding of your file might not be UTF-8. Try overriding the MIME type and charset with overrideMimeType before sending the request:

xhr.open('GET', 'http://www.virtualmechanics./support/tutorials-spinner/Simple2.pdf', true);
xhr.overrideMimeType('text/xml; charset=iso-8859-1');
xhr.send();
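If the goal is the raw bytes rather than readable text, a related trick described on MDN's "Sending and Receiving Binary Data" page is to override the charset with x-user-defined, which maps every byte from 0x80 upward into the range U+F780 to U+F7FF, so the low 8 bits of each character give back the original byte. The following is only a sketch of that approach; binaryStringToBytes and fetchBytesAsText are hypothetical helper names.

```javascript
// Hypothetical helper: recover bytes from text that was decoded with the
// 'x-user-defined' charset by keeping the low 8 bits of each code unit.
function binaryStringToBytes(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    bytes[i] = text.charCodeAt(i) & 0xff;
  }
  return bytes;
}

// Hypothetical helper (browser only): fetch `url` as text with the charset
// overridden, then hand the recovered bytes to `callback`.
function fetchBytesAsText(url, callback) {
  const xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.overrideMimeType('text/plain; charset=x-user-defined');
  xhr.onload = function () {
    if (xhr.status === 200) {
      callback(binaryStringToBytes(xhr.responseText));
    }
  };
  xhr.send();
}
```

Where responseType = 'arraybuffer' or 'blob' is available, that route is usually simpler, since no decoding happens at all.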