This works great to disallow all non-English letters:
/[^\x00-\x7F]+/
But I would like to allow these characters:
âäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ
How do I add those to the regex so that they are allowed?
asked Jun 11, 2017 at 18:37 by pk1557
- Yes, the original set of Unicode characters (C0 and Basic Latin block) you cite includes the letters of the English alphabet but is insufficient for English text. Perhaps you should include ê and ff, too. – Tom Blodget Commented Jun 11, 2017 at 23:26
- Came here on a Google search looking for a way to remove all non-English characters. Your original regex of /[^\x00-\x7F]+/ works great for my needs, so thanks! – Crabgrass0899 Commented Jul 18, 2024 at 5:21
2 Answers
If a pattern like /[^\x00-\x7F]+/ works for you, it also matches all the letters you now want to keep. Since [^...] is a negated character class, the easiest way to exclude a character or a set of characters from the match is to add them to the class:
/[^\x00-\x7FâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]+/
See the regex demo.
If you use an empty string as the replacement, you will remove every run of one or more characters that are neither ASCII (\x00-\x7F) nor one of the letters added to the negated character class.
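A minimal sketch of that replacement, assuming JavaScript (the question does not name a language; the /.../ literal syntax only suggests it). The helper name stripDisallowed is hypothetical, and the normalize("NFC") call is an addition not in the original answer: it composes an "e" followed by a combining accent into a single "é", so that decomposed input is not stripped by accident.

```javascript
// Matches runs of characters that are neither ASCII nor in the allowed list.
// The "g" flag makes replace() remove every run, not just the first one.
const disallowed = /[^\x00-\x7FâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]+/g;

// Hypothetical helper name, for illustration only.
function stripDisallowed(text) {
  // Assumption: input may contain decomposed accents (letter + combining mark).
  // NFC composes them into the precomposed letters listed in the class.
  return text.normalize("NFC").replace(disallowed, "");
}

console.log(stripDisallowed("Noël€日本語")); // prints "Noël"
```

Note that only the listed letters are kept: "à" (lowercase) is not in the list from the question, so it would still be removed.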
Though it looks long, a simple character class would do the job.
Regex: [a-zA-ZâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]
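A rough sketch of how this positive class might be used for validation, again assuming JavaScript. The ^...$ anchors, the + quantifier, and the name isOnlyAllowedLetters are assumptions added for illustration; the answer itself gives only the bare class. Unlike the negated pattern, this class admits letters only, so digits, spaces, and punctuation are rejected as well.

```javascript
// The class from the answer, anchored so the whole string must match.
const onlyAllowedLetters = /^[a-zA-ZâäèéêëîïôœùûüÿçÀÂÄÈÉÊËÎÏÔŒÙÛÜŸÇ]+$/;

// Hypothetical helper name, for illustration only.
function isOnlyAllowedLetters(text) {
  // Assumption: normalize first so decomposed accents compare correctly.
  return onlyAllowedLetters.test(text.normalize("NFC"));
}

console.log(isOnlyAllowedLetters("Noël"));      // prints true
console.log(isOnlyAllowedLetters("Noël 2017")); // prints false (space and digits)
```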