Javascript Regex can't match ellipsis - Stack Overflow

The current REGEX I'm using is the following one:

var sentences = fulltext.match(/[^\.!\?]+[\.!\?]+/g);

That returns an array of the sentences, INCLUDING the spaces (I need all the characters). The problem is that it does not work with an ellipsis "...", and I guess it does not work with other unconventional forms of punctuation either.

How can I fix my REGEX to match this and other forms of punctuation?

Is there any noob-friendly, example-driven guide to REGEX out there?

asked Jan 25, 2014 at 22:54 by Belohlavek, edited Jan 26, 2014 at 1:53 by BenMorel
  • The ellipsis also has its own character / code point, U+2026 or \u2026, which is distinct from 3 consecutive .s (U+002E). – Jonathan Lonowski, Jan 25, 2014 at 22:58
  • Possible duplicate of "Javascript regular expression for punctuation (international)?" – Jonathan Lonowski, Jan 25, 2014 at 23:06

2 Answers

The ellipsis has its own Unicode code point, U+2026, which is written \u2026 in a JavaScript regex.

So you can add \u2026 to both character classes to match an ellipsis.

Code:

var fulltext= "First sentence… Second sentence. ";
fulltext.match(/([^.?!;\u2026]+[.?!;\u2026]+)/g);

Output:

["First sentence…", " Second sentence."]
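To make the point above concrete, here is a minimal sketch showing that the single ellipsis character is not the same string as three periods, and that the character class handles both once \u2026 is listed. The `splitSentences` name and the second sample string are made up for illustration and are not part of the answer.

```javascript
// Hypothetical helper wrapping the regex from the answer above.
function splitSentences(text) {
  // \u2026 is the single-character ellipsis; "." already covers "..." typed as three periods.
  // match() returns null when nothing matches, so fall back to an empty array.
  return text.match(/[^.?!;\u2026]+[.?!;\u2026]+/g) || [];
}

const single = "\u2026"; // one code point, U+2026
const triple = "...";    // three U+002E characters

console.log(single.length, triple.length); // 1 3
console.log(single === triple);            // false

console.log(splitSentences("First sentence\u2026 Second sentence. "));
// [ 'First sentence…', ' Second sentence.' ]

console.log(splitSentences("Wait... What?! Done."));
// [ 'Wait...', ' What?!', ' Done.' ]
```

Note that the trailing space after the last period is dropped, because the pattern only matches text that ends in a terminator.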

You can just add the ellipsis (and any other punctuation characters) to your character sets.

var input = "First sentence… Second sentence. ";
input.match(/[^\.\?!;…]+[\.\?!;…]+/g);

Result:

["First sentence…", " Second sentence."]
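Since the question asks about other punctuation and about keeping all the characters, one possible way to generalize this is to build the character classes from a list of terminators. The sketch below is only an illustration: `makeSentenceRegex`, the terminator list, and the sample input are assumptions, not something either answer proposes. It also adds an alternative for trailing text that has no terminator, so that joining the pieces gives back the original string.

```javascript
// Hypothetical helper: builds a sentence-matching regex from a list of terminator characters.
function makeSentenceRegex(terminators) {
  // Escape the few characters that are special inside a character class: \ ] ^ -
  const cls = terminators.join("").replace(/[\\\]^-]/g, "\\$&");
  // Either: optional text followed by a run of terminators,
  // or: trailing text at the end of the input that has no terminator.
  return new RegExp(`[^${cls}]*[${cls}]+|[^${cls}]+$`, "gu");
}

// Assumed example list; \u3002 is the CJK full stop, included only to show extension.
const terminators = [".", "?", "!", ";", "\u2026", "\u3002"];
const re = makeSentenceRegex(terminators);

const input = "First sentence\u2026 Second sentence. Trailing text";
const parts = input.match(re) || [];

console.log(parts);
// [ 'First sentence…', ' Second sentence.', ' Trailing text' ]

console.log(parts.join("") === input); // true: no characters are lost
```

The terminator list must not be empty, since an empty character class would not produce a useful pattern.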