c# - validate string - only specific language characters - Stack Overflow

Is there a way to check if a string contains characters of a given language only? (for example Japanese, Hebrew, Arabic)

I'm wondering if there is a way to implement this kind of validation in JavaScript/jQuery and in C#?

EDIT

I'm not trying to check whether the string contains valid words from a specific language's dictionary. I'd like to validate that all characters belong to that language.

Asked Aug 17, 2009 at 15:37 by CD..; edited Aug 17, 2009 at 16:09 by Larsenal.
  • Are you talking about language or character set? They're not the same thing. – Craig Stuntz Commented Aug 17, 2009 at 15:41
  • Many languages use the same character set. For example, nearly all of Western Europe. OTOH, every language in your example list uses different character sets than every other language in your list. All of your examples can be distinguished by character set. English and Hungarian, OTOH, cannot. – Craig Stuntz Commented Aug 17, 2009 at 16:00
  • Also, note that "English" text can include ligatures, umlauts (coöperation), quotes from non-English text, etc. It seems to me that you are asking us to give you a specific solution, rather than stating what the real problem is. What problem are you really trying to solve? – Craig Stuntz Commented Aug 17, 2009 at 16:04
  • I've got a textbox on a web page where I want users to enter their name in Hebrew/Arabic only. – CD.. Commented Aug 17, 2009 at 16:15

5 Answers

@CD, sure, you can do that.

In C#, just:

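// Requires: using System.Linq;
string str = "this text has arabic characters";
// True if any char lies in U+FB50..U+FEFC (the Arabic Presentation Forms blocks); the basic Arabic block U+0600..U+06FF is not covered.
bool hasArabicCharacters = str.Any(c => c >= 0xFB50 && c <= 0xFEFC);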

This is for Arabic text, but I didn't test it for other languages:

^[\u0621-\u064A\040]+$
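In JavaScript, a pattern like this can serve as the "only these characters" check from the question. A minimal sketch, assuming a hypothetical helper named `isArabicLettersOnly`; the space is written as `\u0020` here, which matches the same character as `\040`. The range U+0621..U+064A covers the basic Arabic letters only, so diacritics and Arabic-Indic digits are rejected.

```javascript
// Hypothetical helper: true only if every character is a basic Arabic
// letter (U+0621..U+064A) or a space. The "+" also rejects empty strings.
const ARABIC_LETTERS_ONLY = /^[\u0621-\u064A\u0020]+$/;

function isArabicLettersOnly(text) {
  return ARABIC_LETTERS_ONLY.test(text);
}

console.log(isArabicLettersOnly("\u0645\u0631\u062D\u0628\u0627")); // Arabic letters only
console.log(isArabicLettersOnly("\u0645\u0631\u062D\u0628\u0627 abc")); // mixed with Latin letters
```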

Maybe using a regular expression with UNICODE charset?
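One way to read this suggestion is to let the regex engine classify characters by Unicode script instead of listing code-point ranges by hand. A minimal JavaScript sketch, assuming an engine that supports ES2018 Unicode property escapes (the `u` flag); `isOnlyScripts` is a hypothetical helper, and allowing spaces is an assumption made here for names like the ones mentioned in the comments.

```javascript
// Hypothetical helper: true only if every character belongs to one of the
// given Unicode scripts or is a space separator (\p{Zs}).
// Script names are assumed to be valid Unicode script names such as
// "Hebrew" or "Arabic"; an unknown name makes the RegExp constructor throw.
function isOnlyScripts(text, scripts) {
  const classes = scripts.map((s) => `\\p{Script=${s}}`).join("");
  const pattern = new RegExp(`^[${classes}\\p{Zs}]+$`, "u");
  return pattern.test(text);
}

console.log(isOnlyScripts("\u05E9\u05DC\u05D5\u05DD", ["Hebrew"])); // Hebrew letters only
console.log(isOnlyScripts("\u05E9\u05DC\u05D5\u05DD abc", ["Hebrew"])); // mixed with Latin letters
```

ASCII digits and punctuation have the script value Common, so this check rejects them. Japanese text would need several scripts at once, for example `["Hiragana", "Katakana", "Han"]`.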

No, you can't check for an exact language. You can only check for characters that don't occur in other languages, for example Cyrillic letters, hieroglyphs, etc. As a hint, you could use the Google Translate API to detect which language the user is entering.

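// Requires: using System.Text.RegularExpressions;
// Returns true if the text contains at least one character from the listed Arabic blocks.
internal bool HasArabicCharacters(string text)
{
    Regex regex = new Regex(
        "[\u0600-\u06ff]|[\u0750-\u077f]|[\ufb50-\ufc3f]|[\ufe70-\ufefc]");
    return regex.IsMatch(text);
}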
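`IsMatch` with an unanchored pattern reports whether the text contains at least one Arabic character, while the question asks whether every character is Arabic. A JavaScript sketch of both checks over the same four ranges, with hypothetical function names; whitespace is not part of these ranges, so the strict check rejects it.

```javascript
// Same four ranges as in the C# method, merged into one character class.
const ARABIC_CLASS =
  "[\\u0600-\\u06FF\\u0750-\\u077F\\uFB50-\\uFC3F\\uFE70-\\uFEFC]";

// "Contains at least one Arabic character" (what the C# method checks).
function hasArabicCharacters(text) {
  return new RegExp(ARABIC_CLASS).test(text);
}

// "Consists only of Arabic characters" (what the question asks for).
function isAllArabicCharacters(text) {
  return new RegExp(`^${ARABIC_CLASS}+$`).test(text);
}

console.log(hasArabicCharacters("abc \u0645\u0631\u062D\u0628\u0627"));
console.log(isAllArabicCharacters("abc \u0645\u0631\u062D\u0628\u0627"));
```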