javascript - FileReader.readAsArrayBuffer handling of non-ASCII including £ (pound sterling) - Stack Overflow

I am using FileReader.readAsArrayBuffer(file) and converting the result into a Uint8Array.

If the text file input contains a pound sterling sign (£), then this single character results in two byte codes, one for Â and one for £. I understand that this is because £ is in the extended-ASCII set.

Is there a way to prevent this extra character? If not, will it always be an Â? If so, I can strip them out.

asked Mar 7 at 13:24 by hobbes_child
  • Extended ASCII still uses only one byte per character. Â£ is what you get when a pound sign that was encoded in UTF-8 gets interpreted as ASCII/ISO-8859-x. Was the file you are reading saved with a BOM indicating its encoding, or without? – C3roe Commented Mar 7 at 13:31
  • @C3roe it was just saved in Notepad++. I tried adding the BOM before doing the readAsArrayBuffer but it came out 100s of characters shorter. I did this stackoverflow/a/63331294/1071463 – hobbes_child Commented Mar 7 at 13:56
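
To illustrate the comment above, here is a minimal sketch. It assumes the file was saved as UTF-8 and that the bytes are later turned into characters one at a time with String.fromCharCode; the question does not show that step, so treat it as a guess at what is happening rather than a diagnosis.

```javascript
// Assumption: the file is UTF-8, so "£" (U+00A3) is stored as two bytes.
const bytes = new TextEncoder().encode('£');
const byteList = Array.from(bytes); // [0xC2, 0xA3]

// Hypothetical step (not shown in the question): one character per byte.
// 0xC2 becomes "Â" and 0xA3 becomes "£", which gives the stray extra character.
const perByte = String.fromCharCode(...bytes);

// Decoding the same bytes as a UTF-8 sequence yields the single character.
const decoded = new TextDecoder('utf-8').decode(bytes);

// The leading byte is not always 0xC2; it depends on the code point.
const eAcute = Array.from(new TextEncoder().encode('é')); // [0xC3, 0xA9]
const euro = Array.from(new TextEncoder().encode('€'));   // [0xE2, 0x82, 0xAC]

console.log(byteList, perByte, decoded, eAcute, euro);
```

If this matches the situation, stripping every Â would not be a general fix: other non-ASCII characters start with a different byte (or with two of them), and a genuine Â in the text would be removed as well.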

1 Answer

You didn't provide your JS code, but this seems to happen because of a mismatch between the character encoding of the text file and how your JS interprets the bytes. I'm assuming you convert the bytes to text at some point; if so, that is probably where it goes wrong. Here is a small playground for reference, which I hope helps you solve the problem.

function checkFile(file){
    const fileReader = new FileReader();
    fileReader.onload = function(event) {
        const uint8Array = new Uint8Array(event.target.result);
        
        // Use TextDecoder to convert Uint8Array into string 
        const textDecoder = new TextDecoder('utf-8', { fatal: true });
        try{
            const result = textDecoder.decode(uint8Array);
            console.log(result); // Should display the pound sign as £, without the stray Â.
        }catch(error){
            console.error('Decoding failed:', error);
        }
    };
    fileReader.readAsArrayBuffer(file);
}

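function uploadFile(event){
    // Take the event as a parameter instead of relying on the deprecated global window.event
    const file = event.target.files[0];
    if(file){
        checkFile(file);
    }
}
<input type="file" onchange="uploadFile(event)" />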
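
On the BOM question from the comments: as far as I know, TextDecoder drops a leading UTF-8 BOM by default (its ignoreBOM option defaults to false), so there should be no need to add or strip one by hand. The sketch below tries to show that, and also what fatal: true does with a file that is not UTF-8 at all. decodeUtf8OrNull is a hypothetical helper name, and the byte arrays are made-up sample data standing in for the FileReader result.

```javascript
// Hypothetical helper: decode raw file bytes as UTF-8, or return null if they are not valid UTF-8.
function decodeUtf8OrNull(bytes) {
  try {
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (error) {
    return null; // fatal: true throws instead of silently inserting U+FFFD
  }
}

// Sample data: UTF-8 BOM (EF BB BF) followed by "£5" encoded as UTF-8.
const utf8WithBom = new Uint8Array([0xEF, 0xBB, 0xBF, 0xC2, 0xA3, 0x35]);

// Sample data: "£5" as a single-byte encoding such as ISO-8859-1 (0xA3 alone is not valid UTF-8).
const singleByte = new Uint8Array([0xA3, 0x35]);

const fromUtf8 = decodeUtf8OrNull(utf8WithBom);      // BOM removed, two characters remain
const fromSingleByte = decodeUtf8OrNull(singleByte); // null, because decoding fails

console.log(fromUtf8, fromSingleByte);
```

If the second case is what you see with your real file, the file is probably not saved as UTF-8, and the encoding setting in the editor is the thing to check.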
