regex - Regular expression (JavaScript): how to match anything between two tags any number of times - Stack Overflow

I'm trying to find all occurrences of items in an HTML page that sit between <nobr> and </nobr> tags. EDIT: nobr is only an example. I need to find content between arbitrary strings, not always tags.

I tried this

var match = /<nobr>(.*?)<\/nobr>/img.exec(document.documentElement.innerHTML);
alert (match);

But it gives only one occurrence. It also appears twice: once with the <nobr></nobr> tags and once without them. I need only the version without the tags.

Asked May 18, 2009 at 14:40 by Nir; edited May 18, 2009 at 14:56.
  • What result do you get if you do string.match(regex) rather than regex.exec(string)? – nickf Commented May 18, 2009 at 14:50
  • Your question is about global submatches in JavaScript (stackoverflow.com/questions/844001/…) – Chad Commented Aug 2, 2014 at 14:24

6 Answers

With the g flag, each exec call resumes where the previous match ended, so you need to call it in a loop:

var match, re = /<nobr>(.*?)<\/nobr>/img;
while((match = re.exec(document.documentElement.innerHTML)) !== null){
   alert(match[1]);
}
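As a minimal sketch of the same loop, the captures can be collected into an array instead of being alerted one at a time. The helper name `collectCaptures` and the sample string are made up for illustration. Passing the text in as a variable also avoids re-reading `innerHTML` on every iteration.

```javascript
// Hypothetical helper: returns the text of capture group 1 for every match.
function collectCaptures(text, re) {
    if (!re.global) {
        throw new Error("the regex needs the g flag, otherwise exec never advances");
    }
    re.lastIndex = 0; // start from the beginning even if re was used before
    var results = [];
    var match;
    while ((match = re.exec(text)) !== null) {
        results.push(match[1]);
        // Guard against zero-length matches, which would not advance lastIndex.
        if (match[0].length === 0) {
            re.lastIndex++;
        }
    }
    return results;
}

var sample = "foo <nobr> hello </nobr> bar <nobr> world </nobr> foobar";
var found = collectCaptures(sample, /<nobr>(.*?)<\/nobr>/ig);
console.log(found);
```

In a browser, the first argument would be `document.documentElement.innerHTML`.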

Use the DOM:

var nobrs = document.getElementsByTagName("nobr");

You can then loop through all the nobr elements and read their innerHTML, or apply any other action to them.
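The loop over the collection might look like the sketch below. The helper name `nobrContents` is hypothetical, and it takes the root as a parameter (rather than using `document` directly) only so that the snippet can also run outside a browser.

```javascript
// Hypothetical helper: root is anything exposing getElementsByTagName,
// such as the global document in a browser.
function nobrContents(root) {
    var nobrs = root.getElementsByTagName("nobr");
    var contents = [];
    for (var i = 0; i < nobrs.length; i++) {
        contents.push(nobrs[i].innerHTML);
    }
    return contents;
}

// Only runs where a DOM is available.
if (typeof document !== "undefined") {
    console.log(nobrContents(document));
}
```

This approach only applies when the delimiters really are elements. For arbitrary delimiter strings, a regex is still needed.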

(Since I can't comment on Rafael's correct answer...)

exec is doing what it is supposed to do - finding the first match, returning the result in the match object, and setting you up for the next exec call. The match object contains (at index 0) the whole of the string matched by the whole of the regex. In subsequent slots are the bits of the string matched by the parenthesized subgroups. So match[1] contains the bit of the string matched by "(.*?)" in your example.
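To make that state visible, here is a small sketch that calls exec three times on a made-up string and prints the whole match, the captured group, and the regex's `lastIndex` after each call.

```javascript
var re = /<nobr>(.*?)<\/nobr>/ig;
var text = "a <nobr>one</nobr> b <nobr>two</nobr> c";

var first = re.exec(text);
// first[0] is the whole match, first[1] is the first capture group
console.log(first[0], "|", first[1], "|", re.lastIndex);

var second = re.exec(text); // resumes from re.lastIndex
console.log(second[0], "|", second[1], "|", re.lastIndex);

var third = re.exec(text); // no more matches: returns null and resets lastIndex to 0
console.log(third, re.lastIndex);
```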

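You can use a loop, keeping the regex in a variable so that its lastIndex carries over between calls. A regex literal written inside the loop condition is re-created on every evaluation in ES5 and later, which restarts the search each time and never terminates.

var match, re = /<nobr>(.*?)<\/nobr>/img; // created once, so lastIndex persists
while ((match = re.exec("foo <nobr> hello </nobr> bar <nobr> world </nobr> foobar")) !== null)
    alert(match[1]);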

If the strings you're using aren't XML elements and you're sticking with regexes, the return value you're getting can be explained by the bracketing: .exec returns the whole matching string followed by the contents of the bracketed expressions.

If your doc contains:

This is out.
Bzz. This is in. unBzz.

then

/Bzz\.(.*?)unBzz\./img.exec(document.documentElement.innerHTML)

will give you 'Bzz. This is in. unBzz.' in element 0 of the returned array and ' This is in. ' (the captured text, surrounding spaces included) in element 1. Trying to display the whole array shows both as a comma-separated list, because that is how JavaScript converts an array to a string.

So alert(match[1]); is what you're after.
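Since the delimiters may be arbitrary strings, one possible sketch is to build the regex from them at run time. The helper names `escapeRegExp` and `textBetween` are made up for illustration. The delimiters are escaped so that characters such as `.` are matched literally, and `[\s\S]` is used instead of `.` so that the captured text may span line breaks, which is a deliberate change from the patterns above.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(literal) {
    return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: every piece of text found between open and close.
function textBetween(text, open, close) {
    if (!open || !close) {
        throw new Error("both delimiters must be non-empty strings");
    }
    var re = new RegExp(escapeRegExp(open) + "([\\s\\S]*?)" + escapeRegExp(close), "g");
    var results = [];
    var match;
    while ((match = re.exec(text)) !== null) {
        results.push(match[1]);
    }
    return results;
}

var doc = "This is out.\nBzz. This is in. unBzz.";
console.log(textBetween(doc, "Bzz.", "unBzz."));
```

Add the `i` flag to the `RegExp` constructor call if the delimiters should match case-insensitively, as in the original pattern.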

It takes two steps, but you could do it like this:

var matches = document.documentElement.innerHTML.match(/<nobr>(.*?)<\/nobr>/img) || []; // match returns null when nothing is found
alert(matches); // each entry still includes '<nobr>'

for (var i = 0; i < matches.length; i++)
{
    // same regex without the g option, so the capture group is returned
    var inner = matches[i].match(/<nobr>(.*?)<\/nobr>/im);
    alert(inner[1]);
}
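On a runtime that supports `String.prototype.matchAll` (ES2020, so not available when this question was asked), the two steps can be folded into one pass. This is only a sketch on a made-up string, not part of the original answer.

```javascript
var html = "foo <nobr> hello </nobr> bar <nobr> world </nobr> foobar";

// matchAll requires the g flag and yields one full match array per occurrence.
var inner = Array.from(html.matchAll(/<nobr>(.*?)<\/nobr>/ig), function (m) {
    return m[1]; // keep only the first capture group
});
console.log(inner);
```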
