How to convert mixed ascii and unicode to a string in javascript? - Stack Overflow

I have a mixed source of unicode and ascii characters, for example:

var source = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";

How can I convert it to a string by extending the uniCodeToString function below, which I wrote in JavaScript? The function already converts a source made up purely of Unicode escapes.

function uniCodeToString(source){
    //for example, source = "\u5c07\u63a2\u8a0e"
    var escapedSource = escape(source);
    var codeArray = escapedSource.split("%u");
    var str = "";
    for(var i=1; i<codeArray.length; i++){
        str += String.fromCharCode("0x"+codeArray[i]);
    }
    return str;
}

asked Jun 19, 2011 at 5:37 by Ben Jian; edited Jun 19, 2011 at 5:53 by Lance Roberts
  • This question doesn't really make a lot of sense. The source string you've quoted is 13 characters long and doesn't have any "u"s in it at all; did you mean var source = "\\u5c07\\u63a2\\u8a0e HTML5 \\u53ca\\u5176\\u4ed6";? What's your actual underlying technical problem? The real source data, and the real desired end result? – T.J. Crowder Commented Jun 19, 2011 at 5:50

2 Answers

Use encodeURIComponent; escape was never meant for Unicode.

    var source = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";

    var enc = encodeURIComponent(source);

    // returned value: (String)
    // %E5%B0%87%E6%8E%A2%E8%A8%8E%20HTML5%20%E5%8F%8A%E5%85%B6%E4%BB%96

    decodeURIComponent(enc);

    // returned value: (String)
    // 將探討 HTML5 及其他
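For context, here is a rough look at why the escape-based function from the question breaks on mixed input. It is only a sketch that reuses the question's own variable names; the `parts` name is mine. Splitting on "%u" leaves the percent-encoded ASCII run glued to the preceding hex code, so that chunk no longer parses as a number.

```javascript
var source = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";

// escape() writes code units above 0xFF as %uXXXX and the space as %20;
// letters and digits are left as they are.
var escapedSource = escape(source);
console.log(escapedSource);
// %u5C07%u63A2%u8A0E%20HTML5%20%u53CA%u5176%u4ED6

// Splitting on "%u" gives seven chunks; the fourth one carries the ASCII run.
var parts = escapedSource.split("%u");
console.log(parts.length); // 7
console.log(parts[3]);     // 8A0E%20HTML5%20

// "0x8A0E%20HTML5%20" is not a valid number, so the conversion yields NaN,
// and String.fromCharCode(NaN) produces "\u0000" instead of the character.
console.log(Number("0x" + parts[3])); // NaN
```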

I think you are misunderstanding the purpose of Unicode escape sequences.

var source = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";

JavaScript strings are always Unicode (each code unit is a 16-bit UTF-16 encoded value). The purpose of the escapes is to allow you to describe values that are unsupported by the encoding used to save the source file (e.g. the HTML page or .js file is encoded as ISO-8859-1) or to overcome things like keyboard limitations. This is no different to using \n to indicate a linefeed code point.

The above string ("將探討 HTML5 及其他") is made up of the values 5c07 63a2 8a0e 0020 0048 0054 004d 004c 0035 0020 53ca 5176 4ed6 whether you write the sequence as a literal or in escape sequences.
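The point above can be checked directly. The snippet below is a minimal sketch; `codeUnitsHex` is a hypothetical helper written for this illustration, not a built-in. It assumes the file is saved in an encoding that can hold the literal characters (UTF-8, for instance).

```javascript
// The same string written twice: with escape sequences and with literal characters.
var escaped = "\u5c07\u63a2\u8a0e HTML5 \u53ca\u5176\u4ed6";
var literal = "將探討 HTML5 及其他";

// Hypothetical helper: list every UTF-16 code unit as four hex digits.
function codeUnitsHex(str) {
    var units = [];
    for (var i = 0; i < str.length; i++) {
        var hex = str.charCodeAt(i).toString(16);
        units.push(("0000" + hex).slice(-4)); // left-pad to 4 digits
    }
    return units.join(" ");
}

console.log(escaped === literal); // true
console.log(escaped.length);      // 13
console.log(codeUnitsHex(escaped));
// 5c07 63a2 8a0e 0020 0048 0054 004d 004c 0035 0020 53ca 5176 4ed6
```

By the time the code runs, the escapes are already gone: the parser resolved them when it read the literal, so there is nothing left to convert.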

See the String Literals section of ECMA-262 for more details.
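If the real data does contain literal backslash-u text (the doubled-backslash case raised in the comment on the question), one possible approach is a regular-expression replace that touches only the escape sequences and leaves the ASCII parts alone. This is a sketch under that assumption, and `decodeUnicodeEscapes` is a name made up for this example.

```javascript
// Hypothetical helper: turn each literal six-character escape
// (backslash, "u", four hex digits) into the character it names.
// Anything else, such as " HTML5 ", passes through unchanged.
function decodeUnicodeEscapes(source) {
    return source.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// Doubled backslashes: this string really contains the text \u5c07 and so on.
var raw = "\\u5c07\\u63a2\\u8a0e HTML5 \\u53ca\\u5176\\u4ed6";

console.log(raw.length);                // 43
console.log(decodeUnicodeEscapes(raw)); // 將探討 HTML5 及其他
```

If the input is actually JSON, JSON.parse already handles these escapes and is probably the better tool.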
