I have a source string that mixes Unicode escape sequences and ASCII characters, for example:
var source = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";
How do I convert it to a string by extending the uniCodeToString function below, which I wrote myself in JavaScript? The function can already convert a source made up purely of Unicode escapes.
function uniCodeToString(source) {
    // for example, source = "\u5c07\u63a2\u8a0e"
    var escapedSource = escape(source);
    var codeArray = escapedSource.split("%u");
    var str = "";
    for (var i = 1; i < codeArray.length; i++) {
        str += String.fromCharCode("0x" + codeArray[i]);
    }
    return str;
}
asked Jun 19, 2011 at 5:37 by Ben Jian
edited Jun 19, 2011 at 5:53 by Lance Roberts
This question doesn't really make a lot of sense. The source string you've quoted is 13 characters long and doesn't have any "u"s in it at all; did you mean
var source = "\\u5c07\\u63a2\\u8a0e HTML5 \\u53ca\\u5176\\u4ed6";
? What's your actual underlying technical problem? The real source data, and the real desired end result? – T.J. Crowder Commented Jun 19, 2011 at 5:50
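If the real data does contain literal backslash-u text, as the comment above guesses, a regular-expression replace is one way to decode it while leaving the ASCII parts alone. This is only a sketch under that assumption; decodeUnicodeEscapes is a hypothetical name, not something from the question.

```javascript
// Hypothetical helper (not from the original post): turns literal "\uXXXX"
// text into the character it names and leaves everything else untouched.
function decodeUnicodeEscapes(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// Doubled backslashes, so the string really contains backslash-u sequences.
var raw = "\\u5c07\\u63a2\\u8a0e HTML5 \\u53ca\\u5176\\u4ed6";
var decoded = decodeUnicodeEscapes(raw);

console.log(raw.length);     // 43
console.log(decoded);        // 將探討 HTML5 及其他
console.log(decoded.length); // 13
```

Because only the matched six-character sequences are replaced, the " HTML5 " portion passes through unchanged, which is the part the split("%u") approach in the question cannot handle.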
2 Answers
Use encodeURIComponent; escape was never meant for Unicode.
var source = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";
var enc = encodeURIComponent(source);
// returned value: (String)
// %E5%B0%87%E6%8E%A2%E8%A8%8E%20HTML5%20%E5%8F%8A%E5%85%B6%E4%BB%96
decodeURIComponent(enc);
// returned value: (String)
// 將探討 HTML5 及其他
I think you are misunderstanding the purpose of Unicode escape sequences.
var source = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";
JavaScript strings are always Unicode (each code unit is a 16-bit UTF-16 value). The purpose of the escapes is to let you describe values that are unsupported by the encoding used to save the source file (e.g. the HTML page or .js file is encoded as ISO-8859-1) or to overcome things like keyboard limitations. This is no different from using \n to indicate a linefeed code point.
The above string ("將探討 HTML5 及其他") is made up of the values 5c07 63a2 8a0e 0020 0048 0054 004d 004c 0035 0020 53ca 5176 4ed6
whether you write the sequence as a literal or in escape sequences.
See the String Literals section of ECMA-262 for more details.
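A quick way to check this yourself is to compare the two spellings and print the code units. The snippet below is a small sketch that assumes the file is saved in an encoding that can hold the literal characters (for example UTF-8); the variable names are only illustrative.

```javascript
var escaped = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";
var literal = "將探討 HTML5 及其他";

console.log(escaped === literal); // true when the file is read as UTF-8
console.log(escaped.length);      // 13

// List each UTF-16 code unit as four hex digits.
var units = [];
for (var i = 0; i < escaped.length; i++) {
  units.push(("000" + escaped.charCodeAt(i).toString(16)).slice(-4));
}
console.log(units.join(" "));
// 5c07 63a2 8a0e 0020 0048 0054 004d 004c 0035 0020 53ca 5176 4ed6
```

By the time the code runs, the escapes have already been turned into ordinary characters by the parser, so there is nothing left to convert.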