I'm working on a *.po file and trying to capture all the text between msgid "" and msgstr "". No luck so far; I never match more than one line:
msgid ""
"%s asdfgh asdsfgf asdfg %s even if you "
"asdfgdh sentences with no sense. We are not asking translate "
"Shakespeare's %s Hamlet %s !. %s testing regex %s "
"don't require specific industry knowledge. enjoying "
msgstr ""
What I've tried:
var myArray = fileContent.match(/msgid ([""'])(?:(?=(\\?))\2.)*?\1/g);
Thanks for your help, I'm not really good with regex :(
asked May 31, 2013 at 16:22 by lfergon; edited May 31, 2013 at 16:24 by Jerry
4 Answers
Here is one way to extract all of that text:
var match = text.replace(/msgid ""([\s\S]*?)msgstr ""/, "$1");
Example: http://jsfiddle/bqk79/
The [\s\S] is a character class that will match any character including line breaks, so [\s\S]*? will lazily match any number of any character. In other languages you could use the s or DOTALL flag to make . match line breaks; JavaScript did not support this when this answer was written (the s flag was added later, in ES2018).
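Keep in mind that replace returns the whole input with the matched part substituted, so it only isolates the text when the input contains nothing besides that one entry. For a file with several entries, a rough sketch using matchAll and the same pattern could look like the following; the sample content and the names poText, entryPattern, and captured are made up for illustration.

```javascript
// Made-up .po content with two entries, for illustration only.
const poText = [
  'msgid ""',
  '"first line "',
  '"second line"',
  'msgstr ""',
  '',
  'msgid ""',
  '"another entry"',
  'msgstr ""',
].join("\n");

// Same pattern as above, plus the g flag so every entry is visited.
const entryPattern = /msgid ""([\s\S]*?)msgstr ""/g;

// matchAll yields one match per entry; index 1 holds the captured text.
// trim() drops the line breaks right after msgid "" and before msgstr "".
const captured = Array.from(poText.matchAll(entryPattern), (m) => m[1].trim());

console.log(captured.length); // 2
console.log(captured[1]);     // "another entry" (still wrapped in its quotes)
```

Each element of captured still contains the quoted lines exactly as they appear in the file.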
Note that your regex doesn't make any mention of single quotes, but if you also need to match between msgid '' and msgstr '', you can use the following:
var match = text.replace(/msgid (['"]{2})([\s\S]*?)msgstr \1/, "$2");
Try with this pattern:
/msgid (["']{2})\n([\s\S]*?)\nmsgstr \1/
The result is in the second capturing group, but you can make it simpler with:
/msgid ["']{2}\n([\s\S]*?)\nmsgstr /
Here the result is in the first capturing group.
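As a sketch of how that first capturing group might be read and turned into a single string: joinQuotedLines is a hypothetical helper, not part of any library, and it assumes each captured line is one double-quoted segment without escaped quotes inside.

```javascript
const simplerPattern = /msgid ["']{2}\n([\s\S]*?)\nmsgstr /;

// Hypothetical helper: strips the surrounding double quotes from each line
// and concatenates the pieces. It does not interpret escape sequences.
function joinQuotedLines(block) {
  return block
    .split("\n")
    .map((line) => line.trim().replace(/^"|"$/g, ""))
    .join("");
}

const sample = 'msgid ""\n"Hello, "\n"world"\nmsgstr ""';
const found = simplerPattern.exec(sample);

// exec returns null when nothing matches, so guard before reading group 1.
const message = found ? joinQuotedLines(found[1]) : null;
console.log(message); // Hello, world
```

The pattern expects a bare \n after the quotes, so a file saved with \r\n line endings would need \r?\n in those two places.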
I realize that the question specifically asks for a regular expression, but you should consider using string split instead if you can.
Here is a ready-made function:
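function extractTextBetween(subject, start, end) {
  try {
    // Take what follows the first start marker, then cut it at the first end marker.
    return subject.split(start)[1].split(end)[0];
  } catch (e) {
    // Reached when start is absent: split(start)[1] is undefined.
    console.log("Exception when extracting text", e);
  }
}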
http://jsfiddle/b33hdh9b/3/
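One behaviour of the split version worth knowing: when the end marker is missing, split(end)[0] is simply everything after the start marker, and no exception is raised. If that matters, a variant built on indexOf can report both missing markers explicitly. This is only a sketch; textBetween is a hypothetical name and the sample string is made up.

```javascript
// Hypothetical variant: returns null when either marker is missing,
// instead of relying on an exception or returning the remainder.
function textBetween(subject, start, end) {
  const from = subject.indexOf(start);
  if (from === -1) return null;

  const contentStart = from + start.length;
  const to = subject.indexOf(end, contentStart);
  if (to === -1) return null;

  return subject.slice(contentStart, to);
}

const po = 'msgid ""\n"some text"\nmsgstr ""';

console.log(JSON.stringify(textBetween(po, 'msgid ""', 'msgstr ""')));
console.log(textBetween(po, 'msgid ""', 'missing')); // null
```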
You could perhaps try this regex?
msgid ""((?:.|[\n\r])+)msgstr ""
((?:.|[\n\r])+) is your capturing group; (?:.|[\n\r])+ matches either . or [\n\r] one or more times, where \n and \r stand for newline and carriage return.
Tested
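Because + is greedy here, a file with more than one entry would be captured from the first msgid "" through to the last msgstr "". A small comparison with a lazy +? variant, using made-up sample content, may make the difference clearer; the lazy pattern is a suggested adjustment, not part of the answer above.

```javascript
// Made-up content with two entries.
const twoEntries = [
  'msgid ""',
  '"one"',
  'msgstr ""',
  '',
  'msgid ""',
  '"two"',
  'msgstr ""',
].join("\n");

const greedy = /msgid ""((?:.|[\n\r])+)msgstr ""/;
const lazy = /msgid ""((?:.|[\n\r])+?)msgstr ""/; // +? stops at the first msgstr ""

const greedyText = greedy.exec(twoEntries)[1];
const lazyText = lazy.exec(twoEntries)[1];

console.log(greedyText.includes('"two"')); // true: the capture ran into the second entry
console.log(JSON.stringify(lazyText));     // "\n\"one\"\n"
```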