How would I use a regexp to split a string like so:
"foo bar, foobar (bar)" => ["foo", "bar", ",", "foobar", "(", "bar", ")"]
i.e. split on whitespace, with each special character included as a separate element of the resulting array.
asked Jul 20, 2013 at 16:50 by Himanshu P; edited Jul 20, 2013 at 16:55 by Casimir et Hippolyte
- What language do you use? – Casimir et Hippolyte Commented Jul 20, 2013 at 16:53
- I am using JavaScript – Himanshu P Commented Jul 20, 2013 at 16:54
- What output are you looking to obtain? It would help if you didn't repeat foo-bar constantly; use apples, pears, bananas or whatever. – Andy G Commented Jul 20, 2013 at 16:55
4 Answers
Instead of splitting, I'd do the reverse: find all matches of \w+|[^\w\s].
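A minimal sketch of the matching approach, assuming plain JavaScript; the `tokenize` name is only illustrative and does not come from the question. With the `g` flag, `String.prototype.match` returns every match rather than just the first one:

```javascript
// Hypothetical helper: collect words and single special characters.
function tokenize(text) {
  // \w+      a run of word characters (letters, digits, underscore)
  // [^\w\s]  one character that is neither a word character nor whitespace
  // match() returns null when nothing matches, so fall back to an empty array.
  return text.match(/\w+|[^\w\s]/g) || [];
}

console.log(tokenize("foo bar, foobar (bar)"));
// → ["foo", "bar", ",", "foobar", "(", "bar", ")"]
```

Whitespace never matches either alternative, so it is skipped without any filtering step.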
I would probably do something like this:
var foo = [];
// Splitting on a capturing group keeps each separator in the result.
"foo bar, foobar (bar)".split(/(\W)/).forEach(function(elem) {
    // Skip empty strings and whitespace-only pieces.
    if (!/^\s*$/.test(elem)) {
        foo.push(elem);
    }
});
// foo = ['foo', 'bar', ',', 'foobar', '(', 'bar', ')']
The new array "foo" will contain all your values.
I came up with (\w+|[,()]) per http://rubular./r/BGAFLOmkgP
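For illustration, here is how that pattern might behave in JavaScript when used with `match` and the `g` flag. The variable name `listed` and the second input string are made up for this sketch. Note that only the characters named in the class are kept as tokens:

```javascript
// Only ',', '(' and ')' are listed as special characters.
var listed = /(\w+|[,()])/g;

console.log("foo bar, foobar (bar)".match(listed));
// → ["foo", "bar", ",", "foobar", "(", "bar", ")"]

// ';' and '!' are not in [,()], so they are silently dropped.
console.log("foo; bar!".match(listed));
// → ["foo", "bar"]
```

If every punctuation character should be kept, a negated class such as [^\w\s] is the more general choice.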
You can split on the following regex:
"\s|(?=\W)"
Well, in JavaScript this one will give you the following output:
["foo", "bar", ",", "foobar", "(bar", ")"]
Since JavaScript engines before ES2018 don't support look-behinds, it isn't possible there to split (bar into two separate elements.
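Engines that implement ES2018 do accept look-behind assertions, so a split-based version becomes possible there. A rough sketch, assuming such an engine (for example a recent Node.js); the `splitTokens` name is illustrative, and the filter is there because leading or trailing whitespace can produce empty strings:

```javascript
// Hypothetical helper; requires look-behind support (ES2018 or later).
function splitTokens(text) {
  return text
    // Split on whitespace, and at the zero-width positions just before
    // and just after any character that is neither word nor whitespace.
    .split(/\s+|(?=[^\w\s])|(?<=[^\w\s])/)
    // Drop empty pieces, e.g. from leading whitespace.
    .filter(function (part) { return part !== ""; });
}

console.log(splitTokens("foo bar, foobar (bar)"));
// → ["foo", "bar", ",", "foobar", "(", "bar", ")"]
```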
A better way would be to match instead of split. All you want is to get all the substrings matching the following regex pattern:
"\w+|[^\w\s]"
To treat _ as a special character as well, you can use:
"[^_\W]+|[^a-zA-Z0-9\s]"