javascript - international Count sms characters - Stack Overflow

I found Count characters/sms using jQuery, but it does not support international characters such as Chinese, Japanese, Thai, etc.

var $remaining = $('#remaining'),
    $messages = $remaining.next();

$('#message').keyup(function(){
    var chars = this.value.length,
        messages = Math.ceil(chars / 160),
        remaining = messages * 160 - (chars % (messages * 160) || messages * 160);

    $remaining.text(remaining + ' characters remaining');
    $messages.text(messages + ' message(s)');
});

Here are some examples of incorrect character counts:

您好,請問你吃飯了嗎? << 11 characters

สวัสดีคุณกินหรือ? << 17 characters

こんにちは、あなたは食べていますか? << 18 characters

안녕하세요, 당신이 먹는 거죠? << 17 characters

हैलो, आप खाते हैं? << 18 characters

Добры дзень, вы ясьце? << 22 characters

How can I make this work with non-ASCII characters?

asked Mar 28, 2011 at 5:07 by Ironman; edited May 23, 2017 at 11:43
  • Seems to count just fine for the most part. 您好,請問你吃飯了嗎? is 11 characters long, and the numbers for Japanese, Korean and Russian are fine as well. What numbers would you be expecting? Only Thai and Hindi may be off, but I don't know how characters are counted there. – deceze Commented Mar 28, 2011 at 5:21
  • Yes, as you said, Thai and Hindi are different: ดี is already 2 characters. So how can I make the jQuery code I found above support international text such as Chinese, Thai, Japanese, Korean, Hindi and Russian? – Ironman Commented Mar 28, 2011 at 5:32
  • Yes, but "您" is one UTF-8 character. Apparently you want to count bytes, not characters? – deceze Commented Mar 28, 2011 at 6:06

1 Answer

You can't really count in "characters" here. According to the SMS article on Wikipedia, one of three different encodings is used for SMS (7-bit GSM, 8-bit GSM and UTF-16). So first you'll need to know or decide which encoding you'll be using.

If you know you'll always be using UTF-16, then you can count the number of 16-bit code units a string will take up. A standard SMS can consist of 70 16-bit code units. But this will limit messages in Latin characters to 70, too. So if you want to use the full 160 characters (with 7-bit encoding) or 140 characters (with 8-bit encoding) for Latin characters, then you'll need to distinguish between the three cases.
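As a rough sketch of how the three cases could be told apart when counting messages: the single-message limits (160, 140, 70) are the ones given above, while the multipart limits (153, 134, 67) are an assumption that each part of a concatenated SMS gives up 6 of its 140 bytes to a concatenation header. The names SMS_LIMITS and countMessages are made up for this illustration.

```javascript
// Capacity per message, in units of the chosen encoding.
// "single" values come from the answer; "multipart" values ASSUME a
// 6-byte concatenation header inside each 140-byte part.
var SMS_LIMITS = {
  gsm7:  { single: 160, multipart: 153 }, // units are 7-bit septets
  data8: { single: 140, multipart: 134 }, // units are bytes
  utf16: { single: 70,  multipart: 67 }   // units are 16-bit code units
};

// Hypothetical helper: given a size already measured in the units of
// `encoding`, return how many messages are needed and how many units
// are still free in the last one.
function countMessages(units, encoding) {
  var limits = SMS_LIMITS[encoding];
  if (!limits) {
    throw new Error('Unknown encoding: ' + encoding);
  }
  if (units <= limits.single) {
    return { messages: units === 0 ? 0 : 1, remaining: limits.single - units };
  }
  var messages = Math.ceil(units / limits.multipart);
  return { messages: messages, remaining: messages * limits.multipart - units };
}
```

For example, countMessages(message.length, 'utf16') would replace the hard-coded 160 in the question's keyup handler when the text has to be sent as UTF-16.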

Example for counting UTF-16 16-bit code units:

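var message = "您好,請問你吃飯了嗎?";

var utf16codeUnits = 0;

// Iterate by code point: charCodeAt() only ever returns one 16-bit code
// unit, so comparing its result against 0x10000 would always add 1.
for (var ch of message) {
  // Code points above the BMP need a surrogate pair (two code units).
  utf16codeUnits += ch.codePointAt(0) < 0x10000 ? 1 : 2;
}
// utf16codeUnits now equals message.length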

BTW, this will come up with the same numbers you posted as "incorrect", so you'll need to explain why you consider them incorrect.
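To see why the numbers can look "wrong" depending on what you expect, here is a small sketch that measures the same string three ways. The helper name measure is made up for this example, and it assumes an environment where TextEncoder is available as a global (current browsers and recent Node.js versions).

```javascript
// Hypothetical helper: report three different "lengths" of a string.
function measure(text) {
  return {
    // What String.prototype.length reports, and what a UTF-16 SMS is billed in.
    utf16CodeUnits: text.length,
    // Unicode code points; a surrogate pair counts once here.
    codePoints: Array.from(text).length,
    // Size of the text when encoded as UTF-8.
    utf8Bytes: new TextEncoder().encode(text).length
  };
}

// Thai "ดี" is a consonant plus a vowel sign: two code units, two code points.
var thai = measure('\u0E14\u0E35');
// "您" is a single code unit, but three bytes in UTF-8.
var chinese = measure('\u60A8');
```

So ดี counting as 2 is expected for a UTF-16 SMS, and "您" only grows past 1 if you count UTF-8 bytes, which is not what SMS uses.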


EDIT

Although this answer has already been accepted, I quickly hacked up a function that correctly (as far as I can tell) calculates the GSM 7-bit (if possible) and UTF-16 sizes of an SMS message: http://jsfiddle/puKJb/
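The linked code is not reproduced here. As an independent minimal sketch of the same idea, assuming the GSM 03.38 default alphabet and its extension table (whose characters cost two septets because they are sent with an escape), something like the following could work. GSM_BASIC, GSM_EXTENDED and smsSize are names invented for this illustration.

```javascript
// GSM 03.38 basic alphabet: one septet per character (assumed from the
// spec, not taken from the linked fiddle).
var GSM_BASIC =
  "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?" +
  "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà";

// Extension table: sent as escape + character, so two septets each.
var GSM_EXTENDED = "\f^{}\\[~]|€";

// Hypothetical helper: use GSM 7-bit if every character fits,
// otherwise fall back to UTF-16 and count 16-bit code units.
function smsSize(text) {
  var septets = 0;
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (GSM_BASIC.indexOf(ch) !== -1) {
      septets += 1;
    } else if (GSM_EXTENDED.indexOf(ch) !== -1) {
      septets += 2;
    } else {
      // One character outside GSM 7-bit forces the whole message to UTF-16.
      return { encoding: 'utf16', units: text.length };
    }
  }
  return { encoding: 'gsm7', units: septets };
}
```

The returned units can then be compared against 160 (GSM 7-bit) or 70 (UTF-16) for a single message.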
