jquery - JavaScript regular expression to remove unwanted <br>, &nbsp; - Stack Overflow

I have a JS string like this:
&lt;div id="grouplogo_nav"&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;ul&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;li&gt;&lt;a class="group_hlfppt" target="_blank" href="/"&gt;&amp;nbsp;&lt;/a&gt;&lt;/li&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;/ul&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;/div&gt;

I need to remove all <br> and &nbsp; that occur only between &gt; and &lt;. I tried to write a regular expression, but couldn't get it right. Does anybody have a solution?

EDIT:

Please note that I want to remove only the tags between &gt; and &lt;.

Asked Oct 11, 2012 at 11:51 by Nandakumar V; edited Oct 11, 2012 at 12:13.
  • Be careful trying to parse HTML with JavaScript; it may be detrimental to your health: stackoverflow.com/a/1732454/36537 – Phil H, commented Oct 11, 2012 at 12:19

6 Answers

Avoid using regex on HTML!

Try creating a temporary div from the string and using the DOM to remove any br tags from it. This is much more robust than parsing HTML with regex, which can be harmful to your health:

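// Parse the string into a detached div so the browser's HTML parser does the work.
var tempDiv = document.createElement('div');
tempDiv.innerHTML = mystringwithBRin;
var nodes = tempDiv.childNodes;
// Walk backwards so removing a node does not shift the indexes still to be visited.
for (var nodeId = nodes.length - 1; nodeId >= 0; --nodeId) {
    // tagName is upper-case for HTML elements, so compare against 'BR'.
    if (nodes[nodeId].tagName === 'BR') {
        tempDiv.removeChild(nodes[nodeId]);
    }
}
var newStr = tempDiv.innerHTML;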

Note that we iterate over the child nodes in reverse so that the remaining indexes stay valid after a child node is removed.

http://jsfiddle/fxfrt/

myString = myString.replace(/^(&nbsp;|<br>)+/, '');

... where /.../ denotes a regular expression, ^ denotes the start of the string, (&nbsp;|<br>) denotes "&nbsp; or <br>", and + denotes "one or more occurrences of the previous expression". The full match is then simply replaced with an empty string.
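As a quick illustration of what the ^ anchor implies, here is a minimal sketch. The sample string and variable names are made up for this example; only the regular expression comes from the answer. It suggests that only the leading run is stripped, while later occurrences stay in place:

```javascript
// Hypothetical sample input: a leading run of entities/tags, then more of them inside the text.
const sample = '&nbsp;<br>&nbsp;Hello<br>&nbsp;World';

// ^ anchors the match to the start of the string, so only the leading run is removed.
const leadingStripped = sample.replace(/^(&nbsp;|<br>)+/, '');

console.log(leadingStripped); // Hello<br>&nbsp;World
```

So on its own this pattern does not touch anything between &gt; and &lt; further into the string.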

s.replace(/(&gt;)(?:&nbsp;|<br>)+(\s?&lt;)/g,'$1$2');

Don't use this in production. See the answer from Phil H.

Edit: I'll try to explain it a bit; I hope my English is good enough.

Basically we have two different kinds of parentheses here. The first and third pairs () are normal parentheses. They remember the characters matched by the enclosed pattern and group those characters together. For the second pair, we don't need to remember the characters for later use, so we disable the "remember" functionality by using the form (?:) and only group the characters to make the + work as expected. The + quantifier means "one or more occurrences", so &nbsp; or <br> must be there one or more times. The last part (\s?&lt;) matches a whitespace character (\s), which can be missing or occur once (?), followed by the characters &lt;. $1 and $2 act as variables that are replaced by the characters remembered by the first and third pairs of parentheses.

MDN provides a nice table, which explains all the special characters.
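To make the capture-group explanation concrete, here is a small sketch that applies the same expression to a shortened version of the string from the question. The variable names and the shortened input are assumptions made for this example:

```javascript
// Shortened version of the string from the question: escaped tags (&gt; / &lt;)
// with literal <br> and &nbsp; runs between them.
const input =
  '&lt;div&gt;<br>&nbsp;&nbsp; &lt;ul&gt;<br>&nbsp; &lt;li&gt;&amp;nbsp;&lt;/li&gt;' +
  '<br>&nbsp; &lt;/ul&gt;<br>&nbsp; &lt;/div&gt;';

// $1 is the "&gt;" before the run; $2 is the optional whitespace plus "&lt;" after it.
const cleaned = input.replace(/(&gt;)(?:&nbsp;|<br>)+(\s?&lt;)/g, '$1$2');

console.log(cleaned);
// &lt;div&gt; &lt;ul&gt; &lt;li&gt;&amp;nbsp;&lt;/li&gt; &lt;/ul&gt; &lt;/div&gt;
```

The &amp;nbsp; inside the list item survives because it is not a literal &nbsp;, and the single space before each &lt; is kept because it is part of the second remembered group.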

You need to replace globally. Also don't forget that the <br> tag can be self-closed, as <br />. Try this:

myString = myString.replace(/(&nbsp;|<br>|<br \/>)/g, '');
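If the input may spell the tag in other ways, a slightly more tolerant pattern is one option. The sketch below is a variation on the expression above, not the answer's own code; the input string and names are hypothetical:

```javascript
// Hypothetical input mixing the common spellings of the line-break tag.
const messy = 'a<br>b<br/>c<br />d&nbsp;e<BR>f';

// <br\s*\/?> covers <br>, <br/> and <br />; the i flag also accepts <BR>.
const stripped = messy.replace(/&nbsp;|<br\s*\/?>/gi, '');

console.log(stripped); // abcdef
```

Like the original expression, this removes every occurrence in the string, not only those between &gt; and &lt;.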

This worked for me; note the m flag for multi-line strings:

myString = myString.replace(/(&nbsp;|<br>|<br \/>)/gm, '');
myString = myString.replace(/^(&nbsp;|<br>)+/, '');
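For reference, the m flag only changes where ^ and $ can match, so it matters for the anchored form rather than the unanchored one. A minimal sketch, using a made-up two-line input and hypothetical variable names:

```javascript
// Hypothetical two-line input; each line starts with a run of <br>/&nbsp;.
const text = '<br>&nbsp;first\n&nbsp;<br />second';

// Without m, ^ matches only at the very start of the string.
const startOnly = text.replace(/^(&nbsp;|<br>|<br \/>)+/g, '');

// With m, ^ also matches right after each line break.
const everyLine = text.replace(/^(&nbsp;|<br>|<br \/>)+/gm, '');

console.log(JSON.stringify(startOnly)); // "first\n&nbsp;<br />second"
console.log(JSON.stringify(everyLine)); // "first\nsecond"
```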

Hope this helps.
