JavaScript regex to reject non-US-ASCII characters

^[^\x00-\x1F\x7F-\xFF]+$

This regex correctly fails to match a string that contains non-printing characters (hex 00-1F) or extended ASCII characters (hex 80-FF), but, unlike in PHP, it lets non-ASCII UTF-8 characters through (e.g. 日本واستقرارهहिन्दीދިވެހިބަސްગુજરાતી한).

Looking at the Wikipedia page on UTF-8, all of those should fall in the 80-FF range. Does anyone know what I'm missing?

Also, if you could explain how to ignore quoted text, you would be my hero forever.

asked Aug 12, 2010 at 8:23 by Greg, edited Aug 12, 2010 at 8:34

1 Answer

Hmm... instead of rejecting byte ranges, try matching actual Unicode characters, e.g.:

^[\u0020-\u007e]+$
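
As a rough illustration of why the two patterns behave differently: a JavaScript string is a sequence of UTF-16 code units, not UTF-8 bytes, so `\x7F-\xFF` in a character class only covers the code points U+007F to U+00FF. A character such as 日 (U+65E5) lies outside every excluded range and therefore passes the original pattern. The sketch below compares both regexes on a few sample strings; the variable names and the samples are made up for this example.

```javascript
// Pattern from the question: rejects only U+0000-U+001F and U+007F-U+00FF.
const rejectRanges = /^[^\x00-\x1F\x7F-\xFF]+$/;

// Pattern from the answer: accepts only printable US-ASCII (space through tilde).
const printableAscii = /^[\u0020-\u007e]+$/;

// Hypothetical sample inputs, chosen to hit each case.
const samples = ["hello world", "日本", "caf\u00e9", "tab\there"];

const results = samples.map((text) => ({
  text,
  rejectRanges: rejectRanges.test(text),
  printableAscii: printableAscii.test(text),
}));

console.log(results);
// "hello world": both patterns match.
// "日本": the question's pattern matches, the answer's pattern does not.
// "café" and "tab\there": neither pattern matches.
```

Listing the allowed characters, rather than the forbidden ones, avoids having to account for everything above U+00FF.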
