javascript - Converting from a Uint8Array to a string and back - Stack Overflow

I'm having an issue converting from a particular Uint8Array to a string and back. I'm working in the browser and in Chrome which natively supports the TextEncoder/TextDecoder modules.

If I start with a simple case, everything seems to work well:

const uintArray = new TextEncoder().encode('silly face demons');
// Uint8Array(17) [115, 105, 108, 108, 121, 32, 102, 97, 99, 101, 32, 100, 101, 109, 111, 110, 115]
new TextDecoder().decode(uintArray);
// silly face demons

But the following case is not giving me the results I expect. Without getting into too much of the details (it's cryptography related), let's start with the fact that I'm provided with the following Uint8Array:

Uint8Array(24) [58, 226, 7, 102, 202, 238, 58, 234, 217, 17, 189, 208, 46, 34, 254, 4, 76, 249, 169, 101, 112, 102, 140, 208]

and what I want to do is to convert that to a string and then later decrypt the string back to the original array, but I get this:

const uintArray = new Uint8Array([58, 226, 7, 102, 202, 238, 58, 234, 217, 17, 189, 208, 46, 34, 254, 4, 76, 249, 169, 101, 112, 102, 140, 208]);
new TextDecoder().decode(uintArray);
// :�f��:����."�L��epf��
new TextEncoder().encode(':�f��:����."�L��epf��');

...which results in: Uint8Array(48) [58, 239, 191, 189, 7, 102, 239, 191, 189, 239, 191, 189, 58, 239, 191, 189, 239, 191, 189, 17, 239, 191, 189, 239, 191, 189, 46, 34, 239, 191, 189, 4, 76, 239, 191, 189, 239, 191, 189, 101, 112, 102, 239, 191, 189, 239, 191, 189]

The array has doubled in length. Encoding is a bit out of my wheelhouse. Can anyone tell me why the array has doubled (I'm assuming it's an alternate representation of the original array...?). Also, and more importantly, is there a way I could get back to the original array (i.e. undouble the one I'm getting)?

asked Jul 17, 2018 at 0:32 by robmisio, edited Jul 17, 2018 at 0:38

Comments:
  • It is simple: not all byte values correspond to string characters, neither in ASCII nor in Unicode. Also, encrypt/decrypt and encode/decode are being mixed up here; they are not the same thing. – zaph Commented Jul 17, 2018 at 1:01
  • If you only want to convert it into a string and back and get the corresponding values, you can do: var str = String.fromCharCode(...uintArray) and then Uint8Array.from([...str].map(ch => ch.charCodeAt())) – bigless Commented Jul 17, 2018 at 1:30
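
The approach from the last comment can be sketched as a pair of helpers. This is only an illustration: the function names bytesToBinaryString and binaryStringToBytes are made up here, and the loop replaces the spread call on the assumption that very large arrays could exceed the engine's argument limit. Each byte becomes one character with the same code (0 to 255), so no UTF-8 decoding is involved and nothing is lost.

```javascript
// Hypothetical helper: map every byte to the character with the same code.
function bytesToBinaryString(bytes) {
  let str = '';
  for (const b of bytes) {
    str += String.fromCharCode(b); // one UTF-16 code unit per byte
  }
  return str;
}

// Hypothetical helper: read each character's code back into a byte.
function binaryStringToBytes(str) {
  return Uint8Array.from(str, (ch) => ch.charCodeAt(0));
}

// Round trip with the 24 bytes from the question.
const questionBytes = new Uint8Array([58, 226, 7, 102, 202, 238, 58, 234,
  217, 17, 189, 208, 46, 34, 254, 4, 76, 249, 169, 101, 112, 102, 140, 208]);
const asString = bytesToBinaryString(questionBytes);
const roundTripped = binaryStringToBytes(asString);

console.log(asString.length);      // 24
console.log(roundTripped.length);  // 24
```

The resulting string is not readable text; it is just a carrier for the byte values.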

2 Answers

The array you are trying to decode as UTF-8 contains bytes that don't make sense or are not allowed in UTF-8. Pretty much everything >= 128 requires special handling. Some of these values are allowed, but only as leading or continuation bytes of multi-byte sequences, and some, like 254, are just not allowed. The decoder replaces each invalid byte with the replacement character U+FFFD, which encodes back to the three bytes 239, 191, 189; that is why your array grew. If you want to convert back and forth you will need to make sure you are creating valid UTF-8. The codepage layout here might be useful: https://en.wikipedia.org/wiki/UTF-8#Codepage_layout as might the description of illegal byte sequences: https://en.wikipedia.org/wiki/UTF-8#Invalid_byte_sequences.

As a concrete example, this:

let arr = new TextDecoder().decode(new Uint8Array([194, 169]))
let res = new TextEncoder().encode(arr) // => [194, 169]

works because [194, 169] is valid UTF-8 for © but:

let arr = new TextDecoder().decode(new Uint8Array([194, 27]))
let res = new TextEncoder().encode(arr) // => [239, 191, 189, 27]

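doesn't, because it's not a valid sequence: 194 is a leading byte that must be followed by a continuation byte (128 to 191), and 27 is not one.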
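
As a rough sketch of what happens to the array from the question, the snippet below counts the replacement characters produced by the default decoder and then uses a strict decoder to detect invalid input up front. The helper name isValidUtf8 is an assumption made for this example; the fatal option is part of the standard TextDecoder constructor and makes decode() throw a TypeError on malformed input.

```javascript
// The 24 bytes from the question.
const original = new Uint8Array([58, 226, 7, 102, 202, 238, 58, 234,
  217, 17, 189, 208, 46, 34, 254, 4, 76, 249, 169, 101, 112, 102, 140, 208]);

// Default (lenient) decoding: every invalid byte becomes U+FFFD.
const lossy = new TextDecoder().decode(original);
const replacedCount = [...lossy].filter((ch) => ch === '\uFFFD').length;

// U+FFFD encodes to three bytes (239, 191, 189), so the array grows.
const reencoded = new TextEncoder().encode(lossy);

console.log(replacedCount);     // 12
console.log(reencoded.length);  // 48 = 12 intact bytes + 12 * 3

// Hypothetical helper: strict decoding throws instead of replacing.
function isValidUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false; // decode() threw on malformed input
  }
}

console.log(isValidUtf8(original));                    // false
console.log(isValidUtf8(new Uint8Array([194, 169])));  // true
console.log(isValidUtf8(new Uint8Array([194, 27])));   // false
```

Since the replacement discards the original byte values, the 48-byte result cannot be turned back into the original 24 bytes; the conversion has to avoid UTF-8 decoding in the first place.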

To get a string from a Uint8Array and back:

var u8arr = new Uint8Array([34, 128, 255]);
var u8str = u8arr.toString();  // Convert Uint8Array to String
console.log(u8str);
var u8arr2 = Uint8Array.from(u8str.split(',').map(x=>parseInt(x,10)));
console.log(u8arr2);  // back to Uint8Array

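This does not suffer from UTF-8 issues, because the string holds the decimal byte values (here "34,128,255") rather than decoded characters.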
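
The comma-separated string above is lossless but fairly long. One more compact alternative, shown here only as a sketch with made-up helper names (bytesToHex, hexToBytes), is to write each byte as two hexadecimal digits. Like the toString() approach, it never runs the bytes through a UTF-8 decoder.

```javascript
// Hypothetical helper: two lowercase hex digits per byte.
function bytesToHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

// Hypothetical helper: parse the hex string back, two digits at a time.
function hexToBytes(hex) {
  if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
    throw new Error('not a valid hex string');
  }
  const out = new Uint8Array(hex.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
  }
  return out;
}

const sample = new Uint8Array([34, 128, 255]);
const hex = bytesToHex(sample);
console.log(hex);              // 2280ff
console.log(hexToBytes(hex));  // bytes 34, 128, 255 again
```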
