I'm making a JS "command line" emulator.
I have a regexp: /([^\s"]+)|"([^\s"]+)"/g.
I want to match single words, like echo, wyświetl, jd923h90asd8. I also want to match "string literals", something like "this is a string" or "f82h3 23fhn aj293 dgja3 xcn32".
I'm using the match method on the input string to get an array of all matches. The problem is that when the regexp matches a "string literal" and returns the string in the array, that string INCLUDES the double quotes. I don't want the double quotes, so the question is: why does the regexp include them? In the regexp, the quotes are outside the () group. Why does the match include everything?
EDIT:
    var re = /([^\s"]+)|"([^\s"]+)"/g;
    var process = function (text) {
        return execute(text.match(re));
    };
    var execute = function (arr) {
        console.log(arr);
        try {
            //... apply a function with arguments...
        } catch (e) {
            error(arr[0] + ": wrong function");
            return "";
        }
    };
For the input echo abc "abc def" "ghi", the regexp returns the array ["echo", "abc", "abc", "def", ""ghi""].
I want to make a regexp that will return ["echo", "abc", "abc def", "ghi"] from that input.
- Can you demonstrate this in action, perhaps a live demo? Or, at the very least, show the code you're using in your question. Your description isn't as clear as it could be, I'm afraid. That, or it's still too early for my brain... – David Thomas Commented Aug 27, 2013 at 9:12
- Ok, I will add some code. – Are Commented Aug 27, 2013 at 9:13
- It may not be such a bad idea to keep the quotation marks. Simply strip the double quotes when you need the string contents. Aside from allowing parameters with spaces, they also act as a type indicator. You may one day decide that parameters without double quotation marks could be variables, in which case it would be necessary to distinguish a string from a possible variable name (in other words, you may want sort varname to have a different meaning than sort "varname"). – Neil Commented Aug 27, 2013 at 9:46
3 Answers
The quoted part of your regex ("([^\s"]+)") doesn't allow spaces within the quotes. Try removing the \s from it. You could also consider using * instead of + if you need to match empty strings (""):
/([^\s"]+)|"([^"]*)"/g
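As a quick check of the corrected pattern, here is a minimal sketch using the sample input from the question (the variable names are only illustrative). It suggests that match now keeps "abc def" together, although the quotes are still part of each quoted match, because match with the g flag returns whole matches rather than groups:

```javascript
// Corrected pattern from this answer: the quoted part may contain spaces or be empty.
const fixedRe = /([^\s"]+)|"([^"]*)"/g;

const sampleInput = 'echo abc "abc def" "ghi"';
const wholeMatches = sampleInput.match(fixedRe);

console.log(wholeMatches);
// → ["echo", "abc", "\"abc def\"", "\"ghi\""]
```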
This is the only possible explanation, even without looking at any code.
Use group(1) or group(2), not group() or group(0). The latter two (which are fully equivalent) always return the whole matched string, which in your case includes the quotes. I hope this explains what's going on.
PS: As your regex is an "or" regex, group(1) and group(2) will never both have content at the same time. One, the other, or both will be null or empty; both only when there is no match.
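JavaScript has no group() method; the closest equivalent is the array returned by RegExp.prototype.exec, where index 0 is the whole match and indices 1 and 2 are the capture groups. A minimal sketch of the difference, assuming the pattern from the question without the g flag:

```javascript
// Same pattern as in the question, without the g flag so exec always starts at index 0.
const pattern = /([^\s"]+)|"([^\s"]+)"/;

const quoted = pattern.exec('"ghi"');
console.log(quoted[0]); // whole match, quotes included: "ghi"
console.log(quoted[1]); // first alternative did not participate: undefined
console.log(quoted[2]); // contents of the second group: ghi

const bare = pattern.exec('echo');
console.log(bare[0]); // whole match: echo
console.log(bare[1]); // contents of the first group: echo
console.log(bare[2]); // second alternative did not participate: undefined
```

Note that a group that did not take part in the match comes back as undefined in JavaScript, not as null or an empty string.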
I just realized you are using the match method to retrieve all matches as an array. In this case, note that this method always returns the whole matched string for each match (the equivalent of group(0) above). There is no way of telling it to retrieve other groups (like 1 or 2). Consequently, you have 3 alternatives:
- Remove the "s from the strings that have them in the resulting array through some post-processing.
- Do not use JavaScript's match method, but create your own equivalent (and use group(1) or group(2) in it, depending on the case).
- Change your regular expression to match the quotes as zero-width positive lookaheads and lookbehinds. JavaScript supports lookahead, but lookbehind only in engines implementing ES2018 or later; where available, it should be
/([^\s"]+)|(?<=")([^\s"]+)(?=")/g
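The second alternative can be sketched as follows. This is only an illustration under stated assumptions: tokenize is a hypothetical helper name, and the pattern uses [^"]* for the quoted part so that spaces and empty strings are allowed, which differs from the pattern in the question.

```javascript
// Hypothetical helper: collects the captured group of each match instead of
// the whole match, so the surrounding quotes are dropped.
function tokenize(text) {
  // A fresh regex per call keeps the lastIndex state local to this function.
  const re = /([^\s"]+)|"([^"]*)"/g;
  const tokens = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    // m[1] is set for bare words, m[2] for quoted strings (possibly empty).
    tokens.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return tokens;
}

console.log(tokenize('echo abc "abc def" "ghi"'));
// → ["echo", "abc", "abc def", "ghi"]
```

Because every successful match consumes at least one character, the exec loop always advances and terminates at the end of the input.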
To match JavaScript string literals, here's what you're looking for:
/(\w+|("|')(.*?)\2)/g
To explain this: you're either looking for unquoted word characters OR matching quotes with anything in between (the backreference \2 ensures the closing quote is the same character as the opening one, so a string such as "it's his dog" matches correctly).
This is simplified; be aware that it does not match a string with escaped quotes, like:
"my \"complex\" string"
It didn't look like you were worried about that last scenario.
http://regexr.com/3bdbi
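A minimal sketch of how this pattern might be used to get the contents without the quotes, assuming an exec loop rather than match (splitArgs is a hypothetical helper name, not part of the answer above):

```javascript
// Hypothetical helper built on the pattern from this answer.
function splitArgs(text) {
  const re = /(\w+|("|')(.*?)\2)/g;
  const out = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    // m[2] holds the opening quote when the quoted alternative matched;
    // in that case m[3] is the text between the quotes.
    out.push(m[2] !== undefined ? m[3] : m[1]);
  }
  return out;
}

console.log(splitArgs(`echo "it's his dog" 'two words'`));
// → ["echo", "it's his dog", "two words"]
```

One caveat with this approach: \w only covers ASCII letters, digits, and the underscore, so an unquoted word such as wyświetl from the question would be split into wy and wietl.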