I found "Count characters/sms using jQuery", but it does not support international characters such as Chinese, Japanese, Thai, etc.
var $remaining = $('#remaining'),
    $messages = $remaining.next();

$('#message').keyup(function(){
    var chars = this.value.length,
        messages = Math.ceil(chars / 160),
        remaining = messages * 160 - (chars % (messages * 160) || messages * 160);

    $remaining.text(remaining + ' characters remaining');
    $messages.text(messages + ' message(s)');
});
Here are some examples of incorrect character counts:
您好,請問你吃飯了嗎? << 11 characters
สวัสดีคุณกินหรือ? << 17 characters
こんにちは、あなたは食べていますか? << 18 characters
안녕하세요, 당신이 먹는 거죠? << 17 characters
हैलो, आप खाते हैं? << 18 characters
Добры дзень, вы ясьце? << 22 characters
How can I make this work with non-ASCII characters?
asked Mar 28, 2011 at 5:07 by Ironman (edited May 23, 2017 at 11:43)
- Seems to count just fine for the most part. 您好,請問你吃飯了嗎? is 11 characters long, and the numbers for Japanese, Korean and Russian are fine as well. What numbers would you be expecting? Only Thai and Hindi may be off, but I don't know how characters are counted there. – deceze ♦ Commented Mar 28, 2011 at 5:21
- Yes, as you said, Thai and Hindi are different: ดี is already 2 characters. So, with the jQuery code I found above, how can I make it support international text such as Chinese, Thai, Japanese, Korean, Hindi, Russian? – Ironman Commented Mar 28, 2011 at 5:32
- Yes, but "您" is one UTF-8 character. Apparently you want to count bytes, not characters? – deceze ♦ Commented Mar 28, 2011 at 6:06
1 Answer
You can't really count in "characters" here. According to the SMS article on Wikipedia, one of three different encodings is used for SMS (7-bit GSM, 8-bit GSM and UTF-16). So first you'll need to know or decide which encoding you'll be using.
If you know you'll always be using UTF-16, then you can count the number of 16-bit code units a string will take up. A standard SMS can consist of 70 16-bit code units. But this will limit messages in Latin characters to 70, too. So if you want to use the full 160 characters (with 7-bit encoding) or 140 characters (with 8-bit encoding) for Latin characters, then you'll need to distinguish between the three cases.
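The three cases above can be sketched as a small lookup, assuming the single-message capacities quoted in this answer (160, 140 and 70 units). The names `SMS_CAPACITY` and `remainingInSingleSms`, and the keys `gsm7`, `data8` and `utf16`, are hypothetical labels chosen for illustration:

```javascript
// Single-message capacity per encoding, taken from the figures above.
// The keys are hypothetical labels, not part of any SMS API.
var SMS_CAPACITY = { gsm7: 160, data8: 140, utf16: 70 };

// Returns how many units are left in one message.
// A negative result means the text no longer fits in a single SMS.
function remainingInSingleSms(encoding, unitsUsed) {
    if (!Object.prototype.hasOwnProperty.call(SMS_CAPACITY, encoding)) {
        throw new Error("Unknown encoding: " + encoding);
    }
    return SMS_CAPACITY[encoding] - unitsUsed;
}
```

For the 11-code-unit Chinese example from the question, `remainingInSingleSms("utf16", 11)` gives 59, whereas 11 Latin characters in the 7-bit case would leave 149.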
Example for counting UTF-16 16-bit code units:
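var message = "您好,請問你吃飯了嗎?";
var utf16codeUnits = 0;
for (var i = 0, len = message.length; i < len; i++) {
    // charCodeAt() returns a single UTF-16 code unit, which is always below
    // 0x10000, so each index adds 1 and the total equals message.length.
    utf16codeUnits += message.charCodeAt(i) < 0x10000 ? 1 : 2;
}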
BTW, this will come up with the same numbers you posted as "incorrect", so you'll need to explain why you consider them incorrect.
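One way to see where a different expectation might come from is to count the same string in several units. The following is a minimal sketch, assuming an environment that provides `Array.from` and `TextEncoder` (modern browsers and recent Node.js); the function name `countUnits` is hypothetical:

```javascript
// Counts one string in three different units.
function countUnits(text) {
    return {
        // What String.prototype.length reports: 16-bit UTF-16 code units.
        utf16CodeUnits: text.length,
        // Array.from iterates by code point, so a surrogate pair counts once.
        codePoints: Array.from(text).length,
        // Size of the string when encoded as UTF-8.
        utf8Bytes: new TextEncoder().encode(text).length
    };
}
```

For the Thai syllable ดี mentioned in the comments, this reports 2 UTF-16 code units, 2 code points and 6 UTF-8 bytes, because the consonant and the vowel sign are separate code points. For 您 it reports 1 code unit, 1 code point and 3 bytes.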
EDIT
Although this answer has already been accepted, I quickly hacked up a function that correctly (as far as I can tell) calculates the GSM 7-bit (if possible) and UTF-16 sizes of an SMS message: http://jsfiddle/puKJb/
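The fiddle's code is not reproduced here. As a rough illustration of the same idea, and not the author's implementation, the sketch below tries GSM 7-bit first and falls back to UTF-16. It assumes the GSM 03.38 default alphabet and its extension table (where each extension character costs two septets); the names `GSM_BASIC`, `GSM_EXTENDED` and `smsSize` are hypothetical:

```javascript
// GSM 03.38 default alphabet (basic table): one septet per character.
var GSM_BASIC =
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?" +
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà";

// Extension table: sent as an escape plus a code, so two septets per character.
var GSM_EXTENDED = "\f^{}\\[~]|€";

// Returns the encoding a message would need and its size in that encoding's units.
function smsSize(text) {
    var septets = 0;
    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);
        if (GSM_BASIC.indexOf(ch) !== -1) {
            septets += 1;
        } else if (GSM_EXTENDED.indexOf(ch) !== -1) {
            septets += 2;
        } else {
            // Not representable in GSM 7-bit: fall back to UTF-16, where
            // text.length is already the number of 16-bit code units.
            return { encoding: "UTF-16", units: text.length };
        }
    }
    return { encoding: "GSM 7-bit", units: septets };
}
```

With this sketch, `smsSize("Price: 5€")` reports 10 GSM 7-bit septets (the euro sign costs two), while any message containing a character such as 您 is measured in UTF-16 code units instead.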