If I have the following:
content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened."
How would I completely remove the tag altogether so the big string no longer has any anchor tags?
I reached only so far:
var href = content.indexOf("href=\"");
var href1 = content.substring(href).indexOf("\"");
asked Mar 27, 2014 at 3:15 by DemCodeLines (edited Mar 27, 2014 at 3:19)
4 Answers
This is why God invented regular expressions, which the string.replace method accepts as the pattern to search for.
var contentSansAnchors = content.replace(/<\/?a[^>]*>/g, "");
If you're new to regex, some explanation:
/.../: Instead of wrapping the search string in quotes, you wrap it in forward slashes to mark it as a regular expression.
<...>: These are the literal angle brackets of an HTML tag.
\/?: The tag may or may not (?) start with a forward slash (\/). The forward slash must be escaped with a backslash, or the regex would end prematurely here.
a: The literal anchor tag name.
[^>]*: After the a, the tag may contain zero or more (*) characters that are not (^) a closing angle bracket (>). The "anything but a closing bracket" expression is wrapped in square brackets ([...]) because it represents a single character.
g: This flag makes the regular expression global, so that all matches are replaced. Otherwise, only the first match would be replaced.
Depending on what strings you are expecting to parse, you may also want to add the i modifier for case insensitivity.
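To make the effect of the flags concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration. It also shows one caveat of the pattern above: because a[^>]* accepts any characters after the a, it removes other tags whose names start with "a" (such as abbr) as well. Adding a word boundary (\b) after the a is one possible way to narrow it down.

```javascript
// Hypothetical sample: upper-case anchor tags plus an <abbr> tag that also starts with "a".
const sample = '<A HREF="1">I</A> saw <abbr title="HyperText">HTML</abbr> and <a href="2">that</a>.';

// Without the g flag, only the first match is removed.
const firstOnly = sample.replace(/<\/?a[^>]*>/i, "");

// With g and i, every tag whose name starts with "a" is removed, including <abbr>.
const tooGreedy = sample.replace(/<\/?a[^>]*>/gi, "");

// \b requires the tag name to end right after the "a", so <abbr> is left alone.
const anchorsOnly = sample.replace(/<\/?a\b[^>]*>/gi, "");

console.log(firstOnly);   // I</A> saw <abbr title="HyperText">HTML</abbr> and <a href="2">that</a>.
console.log(tooGreedy);   // I saw HTML and that.
console.log(anchorsOnly); // I saw <abbr title="HyperText">HTML</abbr> and that.
```

If the input is known to contain only anchor tags, as in the question, the original pattern and the \b variant behave the same.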
You can use Regexp to replace all anchor tags.
var result = subject.replace(/<a[^>]*>|<\/a>/g, "");
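Here subject stands for whatever string holds the markup. As a quick check, this sketch applies the alternation pattern to the string from the question and compares it with the optional-slash form of the same idea; the variable names are only illustrative.

```javascript
// The string from the question.
const content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

// Opening and closing anchor tags matched as two alternatives.
const viaAlternation = content.replace(/<a[^>]*>|<\/a>/g, "");

// The same idea expressed with an optional leading slash.
const viaOptionalSlash = content.replace(/<\/?a[^>]*>/g, "");

console.log(viaAlternation);                      // I was going here and then that happened.
console.log(viaAlternation === viaOptionalSlash); // true
```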
Strip all tags keeping their text content:
var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";
// parse the HTML string into DOM
var container = document.createElement('div');
container.innerHTML = content;
// retrieve the textContent, or innerText when textContent is not available
var clean = container.textContent || container.innerText;
console.log(clean); //"I was going here and then that happened."
As per OP's comment, the text only contains anchor tags, so this method should work fine.
You may drop the || container.innerText if you don't need IE <= 8 support.
Reference
textContent - Gets or sets the text content of a node and its descendants.
innerText - Sets or retrieves the text between the start and end tags of the object.
Just to answer the question in the title, here is a way to remove only the anchor elements:
var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";
var container = document.createElement('div');
container.innerHTML = content;
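var anchors = container.getElementsByTagName('a'),
    anchor;
// getElementsByTagName returns a live list, so anchors[0] is always the next remaining anchor
while (anchor = anchors[0]) {
    var anchorParent = anchor.parentNode;
    // move the anchor's children out in front of it, then drop the now-empty anchor
    while (anchor.firstChild) {
        anchorParent.insertBefore(anchor.firstChild, anchor);
    }
    anchorParent.removeChild(anchor);
}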
var clean = container.innerHTML;
console.log(clean); //"I was going here and then that happened."
Reference
Node.insertBefore - Inserts the specified node before a reference element as a child of the current node.
Node.removeChild - Removes a child node from the DOM.
Element.getElementsByTagName - Returns a list of elements with the given tag name. The subtree underneath the specified element is searched, excluding the element itself.
Even though OP is not using jQuery, here is a practically equivalent jQuery version of the above, for anyone it may concern:
var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";
var clean = $('<div>').append(content).find('a').contents().unwrap().end().end().html();
console.log(clean); //"I was going here and then that happened."
NOTE
All of the solutions in this answer assume that the content is valid HTML; they won't handle malformed markup, unclosed tags, etc. They also assume that the markup is safe (XSS-sanitized).
If these criteria are not met, you're better off using a regex solution. Regex should usually be your last resort when the use case involves parsing HTML, as it is very easy to break when tested against arbitrary markup (related: virgin-devouring ponies), but your use case seems very simple and a regex solution may be just what you need.
This answer provides non-regex solutions so that you may use these once (if ever) a regex solution breaks.
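As one illustration of how a tag-stripping regex can break on arbitrary markup, consider an attribute value that contains a > character, which is allowed inside a quoted attribute. The input below is a made-up example, and the pattern is the one from the regex answers in this thread.

```javascript
// Hypothetical input: the title attribute contains a literal ">".
const tricky = '<a title="a > b" href="x">text</a>';

// [^>]* stops at the first ">", so the match ends in the middle of the opening tag.
const mangled = tricky.replace(/<\/?a[^>]*>/g, "");

console.log(JSON.stringify(mangled)); // " b\" href=\"x\">text"
```

The rest of the opening tag is left behind in the output, which is the kind of case where parsing the string into DOM nodes is the more robust route.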
If the string is not dynamic and you can obtain it in JavaScript (say you hold it in a variable named replacedString), you can enclose your entire HTML content in a div as shown below:
<div id="stringContent">
<a href="1">I</a> was going here and then <a href="that">that</a> happened.
</div>
and then you can execute this through jQuery:
$("#stringContent").empty();
$("#stringContent").html(replacedString);
<a> tags with only href attribute)? – Fabrício Matté Commented Mar 27, 2014 at 3:22
"<a href=\"1\">I</a> was going" should just be "I was going" – DemCodeLines Commented Mar 27, 2014 at 3:26