
javascript - Replace all instances of anchor tag in a large string - Stack Overflow


If I have the following:

content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened."

How would I completely remove the tag altogether so the big string no longer has any anchor tags?

I only got this far:

var href = content.indexOf("href=\"");
var href1 = content.substring(href).indexOf("\"");

Asked Mar 27, 2014 at 3:15 by DemCodeLines; edited Mar 27, 2014 at 3:19
  • What's the desired output, and what tags should be removed (I'm assuming <a> tags with only href attribute)? – Fabrício Matté Commented Mar 27, 2014 at 3:22
  • Any instances of <a> need to be removed, but the text inside them should remain as it is. For example, in the string above, "<a href=\"1\">I</a> was going" should just be "I was going" – DemCodeLines Commented Mar 27, 2014 at 3:26
  • Answer in jQuery: jsfiddle.net/Q3k7L (probably not too hard to rewrite in vanilla JS) – Fabrício Matté Commented Mar 27, 2014 at 3:38
  • While I really appreciate that you created an example for me, I am really looking for pure JS solutions, since I would be able to better understand them. – DemCodeLines Commented Mar 27, 2014 at 3:40
  • Yeah, I was expecting that you wanted a vanilla solution, I only wrote the jQuery one because it was faster -- and IMO, more understandable/faster to scan (once you get a hang of jQuery) than the nested loops and long DOM API names which jQuery abstracts there – Fabrício Matté Commented Mar 27, 2014 at 3:45

4 Answers


This is why God invented regular expressions, which the string.replace method accepts as the pattern to search for.

var contentSansAnchors = content.replace(/<\/?a[^>]*>/g, "");

If you're new to regex, some explanation:

/.../: Instead of wrapping the search string in quotes, you wrap it in forward slashes to denote a regular expression literal.

<...>: These are the literal angle brackets of an HTML tag.

\/?: The tag may or may not (?) start with a forward slash (\/). The forward slash must be escaped using the backslash or the regex will end prematurely here.

a: Literal anchor tag name.

[^>]*: After the a, the tag may contain zero or more (*) characters that are not (^) a closing angle bracket (>). The "anything but a closing bracket" expression is wrapped in square brackets ([...]) because it is a character class, which matches a single character.

g: This modifies the regular expression to be global, so that all matches are replaced. Otherwise, only the first match would be replaced.

Depending on what strings you are expecting to parse, you may also want to add the i modifier for case insensitivity.
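
One caveat with the pattern above: a[^>]* also matches any tag whose name merely starts with "a", such as <abbr> or <article>. Here is a minimal sketch of a stricter variant, assuming a word boundary (\b) after the tag name and the i modifier suit your input. The sample string and variable names are made up for illustration:

```javascript
// Hypothetical sample input: one uppercase anchor plus an <abbr> that must survive.
const sample = '<A HREF="1">I</A> read the <abbr title="HyperText">HTML</abbr> spec.';

// Original pattern: "a" followed by anything, so <abbr ...> and </abbr> match too,
// while the uppercase <A> is missed because there is no i modifier.
const loose = sample.replace(/<\/?a[^>]*>/g, '');

// \b requires the tag name to end right after "a"; i makes the match case-insensitive.
const strict = sample.replace(/<\/?a\b[^>]*>/gi, '');

console.log(loose);  // <A HREF="1">I</A> read the HTML spec.
console.log(strict); // I read the <abbr title="HyperText">HTML</abbr> spec.
```

Note that \b treats a hyphen as a boundary, so a custom element whose name starts with "a-" would still be stripped.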

You can use a regular expression to remove all anchor tags.

var result = subject.replace(/<a[^>]*>|<\/a>/g, "");
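
For reference, a small runnable sketch of this one-liner applied to the question's string. Here subject is assumed to hold the HTML, and the wrapper name stripAnchors is made up for illustration:

```javascript
// The question's sample string, stored in the variable the one-liner expects.
const subject = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

// Hypothetical helper: the first branch removes opening tags, the second the literal </a>.
function stripAnchors(html) {
    return html.replace(/<a[^>]*>|<\/a>/g, "");
}

console.log(stripAnchors(subject)); // I was going here and then that happened.
```

Because the closing branch is the exact string </a>, an input such as <abbr>x</abbr> would lose its opening tag but keep </abbr>, so this pattern is best kept to input known to contain only anchor tags.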

Strip all tags keeping their text content:

var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

// parse the HTML string into DOM
var container = document.createElement('div');
container.innerHTML = content;

// retrieve the textContent, or innerText when textContent is not available
var clean = container.textContent || container.innerText;
console.log(clean); //"I was going here and then that happened."

As per OP's comment, the text only contains anchor tags, so this method should work fine.

You may drop the || container.innerText if you don't need IE <= 8 support.

Reference

  • textContent - Gets or sets the text content of a node and its descendants.
  • innerText - Sets or retrieves the text between the start and end tags of the object.

Just to answer the question in the title, here is a way to remove only the anchor elements:

var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

var container = document.createElement('div');
container.innerHTML = content;

var anchors = container.getElementsByTagName('a'),
    anchor;

while (anchor = anchors[0]) {
    var anchorParent = anchor.parentNode;

    while (anchor.firstChild) {
        anchorParent.insertBefore(anchor.firstChild, anchor);
    }
    anchorParent.removeChild(anchor);
}

var clean = container.innerHTML;
console.log(clean); //"I was going here and then that happened."

Reference

  • Node.insertBefore - Inserts the specified node before a reference element as a child of the current node.
  • Node.removeChild - Removes a child node from the DOM.
  • Element.getElementsByTagName - Returns a list of elements with the given tag name. The subtree underneath the specified element is searched, excluding the element itself.
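
In current browsers the same unwrapping can be written more compactly. This is a sketch, assuming DOMParser, querySelectorAll, and replaceWith are available (so not old IE). The function name unwrapAnchors is made up for illustration, and the call is guarded so the snippet does nothing outside a browser:

```javascript
// Hypothetical helper: removes <a> elements but keeps whatever they contain.
function unwrapAnchors(html) {
    // Parse into a separate document instead of a div in the live page.
    var doc = new DOMParser().parseFromString(html, 'text/html');

    // querySelectorAll returns a static list, so replacing nodes while iterating is safe.
    doc.body.querySelectorAll('a').forEach(function (anchor) {
        // Replace the anchor with its own child nodes, keeping text and nested markup.
        anchor.replaceWith(...anchor.childNodes);
    });

    return doc.body.innerHTML;
}

// Only run the demo where a DOM is available (for example, a browser console).
if (typeof DOMParser !== 'undefined') {
    console.log(unwrapAnchors('<a href="1">I</a> was going here and then <a href="that">that</a> happened.'));
}
```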

Even though OP is not using jQuery, here is a practically equivalent jQuery version of the above for anyone interested:

var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

var clean = $('<div>').append(content).find('a').contents().unwrap().end().end().html();
console.log(clean); //"I was going here and then that happened."


NOTE

All of the solutions in this answer assume that the content is valid HTML; they won't handle malformed markup, unclosed tags, etc. They also assume that the markup is safe (XSS-sanitized).

If the criteria above are not met, you're better off using a regex solution. Regex should usually be your last resort when the use case involves parsing HTML, as it is very easy to break when tested against arbitrary markup (related: virgin-devouring ponies), but your use case seems very simple and a regex solution may be just what you need.

This answer provides non-regex solutions so that you may use these once (if ever) a regex solution breaks.
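
To make the fragility concrete, here is a small sketch of one input that trips up a tag-stripping regex: an attribute value that itself contains ">", which HTML permits inside quoted attribute values. The sample string is invented for illustration:

```javascript
// Invented sample: the title attribute contains a literal ">".
const tricky = '<a title="1 > 0" href="x">link</a>';

// [^>]* cannot tell that the first ">" sits inside a quoted attribute value,
// so the match ends early and the rest of the opening tag is left behind.
const stripped = tricky.replace(/<\/?a\b[^>]*>/g, '');

console.log(JSON.stringify(stripped)); // " 0\" href=\"x\">link"
```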

If you can somehow obtain your string in JavaScript, for instance when it is not dynamic (say you hold it in a variable named replacedString), you can fix this by enclosing your entire HTML content in a div, as shown below:

<div id="stringContent">
  <a href="1">I</a> was going here and then <a href="that">that</a> happened.
</div>

and then run this with jQuery:

$("#stringContent").empty();
$("#stringContent").html(replacedString);