javascript - Regular expression to reject all non-English characters except some characters with accents - Stack Overflow

This works great to disallow all non-English letters:

/[^\x00-\x7F]+/

But I would like to allow these characters:

âäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ

How do I add those to the regex so that they are allowed?

asked Jun 11, 2017 at 18:37 by pk1557
  • Yes, the original set of Unicode characters (C0 and Basic Latin block) you cite includes the letters of the English alphabet but is insufficient for English text. Perhaps you should include ê and ff, too. – Tom Blodget Commented Jun 11, 2017 at 23:26
  • Came here on a Google search looking for a way to remove all non-English characters. Your original regex of /[^\x00-\x7F]+/ works great for my needs, so thanks! – Crabgrass0899 Commented Jul 18, 2024 at 5:21

2 Answers

If a pattern like /[^\x00-\x7F]+/ works for you, be aware that it currently matches all the letters you now want to stop matching.

Since the [^...] is a negated character class, the easiest way to exclude a char/set of chars is to just add them to the class:

/[^\x00-\x7FâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]+/

See the regex demo.

If you use an empty string as the replacement, you will remove every run of one or more characters that are not ASCII (\x00-\x7F) and are not among the letters added to the negated character class.
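The removal described above can be sketched in JavaScript as follows. This is only a minimal illustration: the helper name `stripDisallowed` is hypothetical, and the `g` flag is an addition not present in the pattern as quoted, included on the assumption that every disallowed run in the string should be removed rather than just the first one.

```javascript
// Negated class: matches runs of characters that are neither ASCII
// nor one of the explicitly allowed accented letters.
// The "g" flag is an assumption: without it, replace() only removes the first run.
const disallowed = /[^\x00-\x7FâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]+/g;

// Hypothetical helper: strips every disallowed run from the input string.
function stripDisallowed(text) {
  return text.replace(disallowed, "");
}

console.log(stripDisallowed("señor café")); // "seor café" (ñ is not in the allowed set)
```

Characters in the ASCII range, including digits, spaces, and punctuation, pass through untouched, since the class only targets what lies outside `\x00-\x7F` and the listed letters.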

Although it looks long, a simple character class would do the job.

Regex: [a-zA-ZâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]
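As written, this class matches a single allowed letter. One way it might be used for validation is sketched below; the `^...$` anchors, the `+` quantifier, and the helper name `isAllowedWord` are assumptions added for illustration and are not part of the answer. Note that, unlike the negated-class approach, this whitelist also rejects digits, spaces, and punctuation.

```javascript
// Whitelist: the whole string must consist of English letters
// or the listed accented letters. Anchors and "+" are assumptions.
const allowedLetters = /^[a-zA-ZâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]+$/;

// Hypothetical helper: true only if every character is an allowed letter.
function isAllowedWord(text) {
  return allowedLetters.test(text);
}

console.log(isAllowedWord("Noël"));        // true
console.log(isAllowedWord("señor"));       // false (ñ is not listed)
console.log(isAllowedWord("hello world")); // false (space is not a letter)
```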
