I'm trying to find all occurrences of items in an HTML page that are between <nobr> and </nobr> tags.
EDIT:(nobr is an example. I need to find content between random strings, not always tags)
I tried this
var match = /<nobr>(.*?)<\/nobr>/img.exec(document.documentElement.innerHTML);
alert (match);
But it gives only one occurrence. Also, the result appears twice: once with the <nobr></nobr> tags and once without them. I need only the version without the tags.
- What result do you get if you do string.match(regex) rather than regex.exec(string) ? – nickf Commented May 18, 2009 at 14:50
- Your question is about global submatches in JavaScript - (stackoverflow.com/questions/844001/…) – Chad Commented Aug 2, 2014 at 14:24
6 Answers
You need to do it in a loop:
var match, re = /<nobr>(.*?)<\/nobr>/img;
while((match = re.exec(document.documentElement.innerHTML)) !== null){
alert(match[1]);
}
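Since the question says the delimiters are arbitrary strings rather than always tags, the same loop could be wrapped in a helper that escapes the delimiters before building the regex. This is only a sketch: `findBetween` and `escapeRegExp` are hypothetical names, not part of any library, and a plain string stands in for `document.documentElement.innerHTML`.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: collect everything found between `open` and `close`.
// Assumes at least one of the two delimiters is a non-empty string.
function findBetween(source, open, close) {
  // [\s\S] is used instead of . so that content spanning line breaks matches too
  var re = new RegExp(escapeRegExp(open) + "([\\s\\S]*?)" + escapeRegExp(close), "gi");
  var results = [];
  var match;
  while ((match = re.exec(source)) !== null) {
    results.push(match[1]); // group 1 only, without the delimiters
  }
  return results;
}

var html = "foo <nobr>hello</nobr> bar <nobr>world</nobr> baz";
console.log(findBetween(html, "<nobr>", "</nobr>"));
console.log(findBetween("a Bzz. in unBzz. b", "Bzz.", "unBzz."));
```

Escaping matters for delimiters such as "Bzz.", where an unescaped dot would match any character.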
Use the DOM:
var nobrs = document.getElementsByTagName("nobr");
and you can then loop through all nobrs and extract the innerHTML or apply any other action on them.
(Since I can't comment on Rafael's correct answer...)
exec is doing what it is supposed to do: finding the first match, returning the result in the match object, and setting you up for the next exec call. The match object contains (at index 0) the whole of the string matched by the whole of the regex. In subsequent slots are the bits of the string matched by the parenthesized subgroups. So match[1] contains the bit of the string matched by "(.*?)" in your example.
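To make the description above concrete, here is a small sketch that calls exec several times on one regex object with the g flag. The sample string is made up for illustration; the point is that `lastIndex` on the regex records where the next call resumes.

```javascript
var re = /<nobr>(.*?)<\/nobr>/gi;
var text = "foo <nobr>hello</nobr> bar <nobr>world</nobr>";

var first = re.exec(text);
console.log(first[0]);     // the whole match, delimiters included
console.log(first[1]);     // only the text captured by (.*?)
console.log(re.lastIndex); // position where the next exec call will start

var second = re.exec(text); // continues after the first match
console.log(second[1]);

var third = re.exec(text);  // no matches left: returns null and lastIndex goes back to 0
console.log(third, re.lastIndex);
```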
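You can use:
// keep the regex in a variable: a literal inside the loop condition is re-created on every pass in current engines, so lastIndex would never advance
var match, re = /<nobr>(.*?)<\/nobr>/img;
while ((match = re.exec("foo <nobr> hello </nobr> bar <nobr> world </nobr> foobar")) !== null)
alert(match[1]);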
If the strings you're using aren't XML elements and you're sticking with regexes, the return value you're getting can be explained by the bracketing: .exec returns the whole matching string followed by the contents of the bracketed expressions.
If your doc contains:
This is out.
Bzz. This is in. unBzz.
then
/Bzz.(.*?)unBzz./img.exec(document.documentElement.innerHTML)
will give you 'Bzz. This is in. unBzz.' in element 0 of the returned array and 'This is in.' (with its surrounding spaces) in element 1. Trying to display the whole array gives both as a comma-separated list because that's what JavaScript does to try to display it.
So
alert(match[1]);
is what you're after.
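A short sketch of why the alert showed the text twice, using a plain string in place of the page's HTML (an assumption for illustration): converting the result array to a string joins its entries with commas, while indexing returns only the captured part.

```javascript
var doc = "This is out.\nBzz. This is in. unBzz.";
var match = /Bzz\.(.*?)unBzz\./i.exec(doc);

// alert(match) displays the array converted to a string: its entries joined by commas
var shown = String(match);
console.log(shown);

// Indexing picks out just the captured text; trim() drops the spaces around it
var inner = match[1].trim();
console.log(inner);
```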
It takes two steps, but you could do it like this:
// match() returns null when nothing matches, so fall back to an empty array
var match = document.documentElement.innerHTML.match(/<nobr>(.*?)<\/nobr>/img) || [];
alert(match); // each entry still includes '<nobr>'
var match_length = match.length;
for (var i = 0; i < match_length; i++)
{
var match2 = match[i].match(/<nobr>(.*?)<\/nobr>/im); // same regex without the g option
alert(match2[1]);
}
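In environments that support `String.prototype.matchAll` (added in ES2020, so well after this thread was written), the two steps can probably be collapsed into one, because each yielded match keeps its capture groups. A minimal sketch on a sample string rather than the page's HTML:

```javascript
var html = "foo <nobr>hello</nobr> bar <nobr>world</nobr> foobar";

// matchAll requires the g flag and yields one match array (with groups) per occurrence
var contents = Array.from(html.matchAll(/<nobr>(.*?)<\/nobr>/gi), function (m) {
  return m[1]; // keep only the captured text, without the tags
});

console.log(contents);
```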