I've come across a problem: I don't see an easy way to convert strings to typed arrays, and converting from typed arrays to strings appears to be a real pain, requiring a manual char code conversion for every byte. Are there any better methods to convert strings to typed arrays or vice versa?
Example:
I have a UTF-8 encoded string, "Something or other", and I want to write it to an ArrayBuffer in length-then-string format (the length first, followed by the string's bytes).
asked Nov 16, 2011 at 14:59 by Adam M-W, edited Nov 16, 2011 at 15:24
- What kind of conversion do you want exactly? Can you give us an example? – Rodolphe Commented Nov 16, 2011 at 15:20
- Typed Arrays are only implemented for numbers. You somehow need to convert a string into numbers, i.e. str.charCodeAt and String.fromCharCode. You can make it more convenient, though: jsfiddle/ACXM7. – pimvdb Commented Nov 16, 2011 at 15:32
- Why do you want to store them in typed arrays at all? That sounds like you wanted to simulate the internal representation of a simple js string :-) – Bergi Commented Sep 27, 2012 at 11:51
3 Answers
You can use TextEncoder and TextDecoder. They were originally only implemented in Firefox but have since gained wider support. This comes two full years after your initial question :)
Fortunately, a polyfill exists for environments that lack them.
var initialString = "Something or other";
// TextEncoder always produces UTF-8; its constructor takes no encoding argument.
var utf8EncodedString = new TextEncoder().encode(initialString);
// utf8EncodedString is a Uint8Array, so you can inspect
// the individual bytes directly:
for (var i = 0; i < utf8EncodedString.length; ++i) {
  console.log(utf8EncodedString[i]);
}
var decodedString = new TextDecoder("utf-8").decode(utf8EncodedString);
if (initialString !== decodedString) {
  console.error("You're lying!");
}
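For the length-then-string layout from the question, a minimal sketch could look like the following. The question does not say how wide the length field is, so the 4-byte big-endian prefix here is an assumption, and writeLengthPrefixed and readLengthPrefixed are hypothetical helper names, not part of any library.

```javascript
// Hypothetical helpers; the 4-byte big-endian length prefix is an assumption.
function writeLengthPrefixed(str) {
  var bytes = new TextEncoder().encode(str);        // UTF-8 bytes as a Uint8Array
  var buffer = new ArrayBuffer(4 + bytes.length);
  new DataView(buffer).setUint32(0, bytes.length);  // DataView is big-endian by default
  new Uint8Array(buffer, 4).set(bytes);             // copy the payload after the prefix
  return buffer;
}

function readLengthPrefixed(buffer, offset) {
  offset = offset || 0;
  var byteLength = new DataView(buffer).getUint32(offset);
  // View only the payload bytes; no copy is made.
  var bytes = new Uint8Array(buffer, offset + 4, byteLength);
  return new TextDecoder("utf-8").decode(bytes);
}

var packed = writeLengthPrefixed("Something or other");
console.log(packed.byteLength);           // 22: 4-byte prefix + 18 bytes of text
console.log(readLengthPrefixed(packed));  // "Something or other"
```

Note that the prefix stores the byte length of the UTF-8 data, which can be larger than the string's length for non-ASCII text.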
This should solve your problem: How to use strings with JavaScript Typed Arrays
JS strings are exposed as sequences of UTF-16 code units, each 2 bytes wide; characters outside the Basic Multilingual Plane take two code units. String.prototype.charCodeAt returns these 16-bit code units. This is how to read UTF-16 encoded strings from a DataView:
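// length is the number of 16-bit code units, not bytes.
// littleEndian is passed through to getUint16; when omitted, DataView reads big-endian.
DataView.prototype.getUTF16String = function(offset, length, littleEndian) {
  var utf16View = new Uint16Array(length);
  for (var i = 0; i < length; ++i) {
    // Each code unit is 2 bytes wide, so read 16 bits and step by 2 bytes.
    utf16View[i] = this.getUint16(offset + i * 2, littleEndian);
  }
  return String.fromCharCode.apply(null, utf16View);
};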
These functions convert a string to and from an ArrayBuffer:
function ab2str(buf) {
  return String.fromCharCode.apply(null, new Uint16Array(buf));
}
function str2ab(str) {
  var buf = new ArrayBuffer(str.length * 2); // 2 bytes for each char
  var bufView = new Uint16Array(buf);
  for (var i = 0, strLen = str.length; i < strLen; i++) {
    bufView[i] = str.charCodeAt(i);
  }
  return buf;
}
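One caveat with this approach: String.fromCharCode.apply passes every code unit as a separate argument, and engines limit how many arguments a call can take, so very large buffers may throw a RangeError. A possible workaround is to convert in chunks, sketched below. The name ab2strChunked and the default chunk size of 8192 are assumptions chosen for illustration.

```javascript
// Hypothetical variant of ab2str; the default chunk size of 8192 is an arbitrary assumption.
function ab2strChunked(buf, chunkSize) {
  chunkSize = chunkSize || 8192;
  var codeUnits = new Uint16Array(buf);
  var parts = [];
  for (var i = 0; i < codeUnits.length; i += chunkSize) {
    // subarray returns a view on the same memory, so no code units are copied here.
    parts.push(String.fromCharCode.apply(null, codeUnits.subarray(i, i + chunkSize)));
  }
  return parts.join("");
}

// Build a 200,000-character string and store it as UTF-16 code units.
var longString = new Array(200001).join("x");
var units = new Uint16Array(longString.length);
for (var j = 0; j < longString.length; j++) {
  units[j] = longString.charCodeAt(j);
}
console.log(ab2strChunked(units.buffer) === longString); // true
```

Where TextDecoder is available, new TextDecoder("utf-16le").decode(buf) is another option for little-endian data.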
If you are willing to use external libraries, you can take a look at jDataView, which supports reading strings in different encodings from a TypedArray and handles endianness conversion as well.