javascript - What encoding is expected for Node.js source code? - Stack Overflow

I’ve done some Google searches, but I get results related to encoding strings or files.

Can I write my Node.js JavaScript source code in UTF-8? Can I use non-ASCII characters in comments, strings, or as variable names?

ECMA-262 seems to require UTF-16 encoding, but Node.js won’t run a UTF-16 encoded .js file. It will, however, run UTF-8 source and correctly interpret non-ASCII characters.

So is this by design or by “accident”? Is it specified somewhere that UTF-8 source code is supported?


asked Apr 12, 2012 at 14:02 by Nate
  • 1 I've never given this a second thought, but I constantly use UTF-8 for everything I do and have never had a problem. – Alex Turpin Commented Apr 12, 2012 at 14:05
  • 1 I expect that it's not so much a Node.js thing, but a V8 thing. – Pointy Commented Apr 12, 2012 at 14:07
  • 1 I was hoping someone could point to, say, Node.js or V8 documentation that says what source encodings are allowed. (Python example: python/dev/peps/pep-0263). Yeah, I can and did futz around and see what works, but I want a more concrete answer. – Nate Commented Apr 12, 2012 at 15:12
  • You're linking to a very old version of the spec (3rd rev. is from 1999, we just hit 6th rev. last June). The current version is here. The requirement is "Unicode" (with, by convention, ASCII being a subset of Unicode, since the first 128 code points in Unicode are the same as the ASCII encoding specifies) – Mike 'Pomax' Kamermans Commented Sep 11, 2015 at 17:07
  • Hi @Nate, it seems some years have passed since you asked this question. I'm looking for something like the Python example you mentioned in your comment. Have you found a concrete answer in the meantime? – Daniele Ricci Commented Nov 11, 2021 at 12:42

2 Answers

Reference: http://mathiasbynens.be/notes/javascript-identifiers

Non-ASCII Unicode characters are valid in JavaScript variable names (see the reference above for the exact rules). Go ahead and save your source as UTF-8.

I can't find documentation that says that Node treats files as encoded in UTF-8, but it seems that way experimentally:

/* Check in your editor that this Javascript file was saved in UTF-8 */
var nonEscaped = "Планета_Зямля";
var escaped = "\u041f\u043b\u0430\u043d\u0435\u0442\u0430\u005f\u0417\u044f\u043c\u043b\u044f";
if (nonEscaped === escaped) {
  console.log("They match");
}

The above example prints They match.
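The same kind of experiment can be extended from strings to identifiers and comments, which the question also asks about. This is only a sketch: the variable names are made up for illustration, and it assumes the file is saved as UTF-8, as in the example above.

```javascript
/* Assumes this file was saved in UTF-8; all names are illustrative. */
// Comments may contain non-ASCII text too: Планета Зямля
var π = 3.14159;          // identifier is U+03C0 GREEK SMALL LETTER PI
var планета = "Зямля";    // Cyrillic identifier holding a Cyrillic string

// An identifier can also be spelled with a Unicode escape;
// \u03c0 refers to the same variable as π.
var sameVariable = (\u03c0 === π);

console.log(sameVariable);     // true
console.log(планета.length);   // 5
```

If Node were decoding the file as something other than UTF-8, the identifiers would not parse as intended, so a clean run is further experimental evidence rather than documented behavior.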

Non-BMP note:

Note that UTF-8 supports non-BMP code points (U+10000 and onwards), but JavaScript has complications in that case: it automatically converts them to surrogate pairs. This is part of the language:

/* Check in your editor that this Javascript file was saved in UTF-8 */
var nonEscaped = "
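The surrogate-pair behavior described in the non-BMP note can be checked with a small sketch that uses escapes only, so the result does not depend on how the file is saved. The code point U+1D306 is an arbitrary choice for illustration, and the `\u{...}` escape, `codePointAt`, and `Array.from` assume an ES2015-capable engine, which is newer than the Node.js versions current when this was asked.

```javascript
// Escapes only, so the file's own encoding does not affect the result.
var astral = "\u{1D306}";            // one code point outside the BMP
var surrogatePair = "\uD834\uDF06";  // the same character as two UTF-16 code units

console.log(astral === surrogatePair);            // true
console.log(astral.length);                       // 2 (length counts UTF-16 code units)
console.log(astral.codePointAt(0).toString(16));  // "1d306"
console.log(Array.from(astral).length);           // 1 (iteration walks code points)
```

The two strings compare equal because a JavaScript string is a sequence of UTF-16 code units, regardless of how the source file was encoded on disk.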