I have a regular expression task at hand and could really use some help.
Say I have text like the following:
To Sherlock Holmes she is always <i>THE</i> woman.
I need to enclose each character in a span tag, with the exception of HTML tags. For example, the text above would become:
<span>T</span><span>o</span><span> </span><span>S</span><span>h</span>
<span>e</span><span>r</span><span>l</span><span>o</span><span>c</span>
<span>k</span><span> </span><span>H</span><span>o</span><span>l</span>
<span>m</span><span>e</span><span>s</span><span> </span><span>s</span>
<span>h</span><span>e</span><span> </span><span>i</span><span>s</span>
<span> </span><span>a</span><span>l</span><span>w</span><span>a</span>
<span>y</span><span>s</span><span> </span><i><span>T</span><span>H</span>
<span>E</span></i><span> </span><span>w</span><span>o</span><span>m</span>
<span>a</span><span>n</span><span>.</span>
Note that:
- each character is enclosed in a span tag, even a space
- an HTML tag, such as <i></i>, is not
Any suggestion is welcome.
Thanks!
asked Feb 21, 2011 at 4:44 by Grnbeagle
- Do you know how much markup will be present? Specifically, do you know if you'll be dealing with nested HTML tags, like <b><i>awesomesauce</i></b>? The presence of nested tags makes it a considerably harder problem. – Benson Commented Feb 21, 2011 at 4:46
- rest assured, it's not a regular expression task :) – Anurag Commented Feb 21, 2011 at 4:51
- Nested tags are quite possible, but initially we can assume there are none if that makes it easier. – Grnbeagle Commented Feb 21, 2011 at 4:53
- Why do you need this? It's quite likely there's a better solution. – Sophie Alpert Commented Feb 21, 2011 at 4:58
3 Answers
This job is better handled by DOM interactions. The following two utility functions help wrap each character of the given text in a span tag.
/**
 * Recursively collect all text nodes under the given node into an array.
 */
function getTextNodes(node) {
    var childTextNodes = [];
    if (!node.hasChildNodes()) {
        // return an empty array (not undefined) so callers can always iterate
        return childTextNodes;
    }
    var childNodes = node.childNodes;
    for (var i = 0; i < childNodes.length; i++) {
        if (childNodes[i].nodeType === Node.TEXT_NODE) {
            childTextNodes.push(childNodes[i]);
        }
        else if (childNodes[i].nodeType === Node.ELEMENT_NODE) {
            // descend into child elements and append their text nodes
            Array.prototype.push.apply(childTextNodes, getTextNodes(childNodes[i]));
        }
    }
    return childTextNodes;
}
/**
 * Given a text node, wrap each of its characters in the
 * given tag, then remove the original text node.
 */
function wrapEachCharacter(textNode, tag) {
    var text = textNode.nodeValue;
    var parent = textNode.parentNode;
    var characters = text.split('');
    characters.forEach(function(character) {
        var element = document.createElement(tag);
        var characterNode = document.createTextNode(character);
        element.appendChild(characterNode);
        // insert each wrapped character just before the original text node
        parent.insertBefore(element, textNode);
    });
    parent.removeChild(textNode);
}
Now, given some piece of HTML, we create a DOM representation of it and retrieve all of its text nodes using the first function, getTextNodes. Once we have all the text nodes, we pass each of them to the second function, wrapEachCharacter.
// create a wrapper element that will hold our HTML.
var container = document.createElement('div');
container.innerHTML = "To Sherlock Holmes she is always <i>THE</i> woman.";
// get all text nodes recursively.
var allTextNodes = getTextNodes(container);
// wrap each character in each text node thus gathered.
allTextNodes.forEach(function(textNode) {
    wrapEachCharacter(textNode, 'span');
});
An example is posted here.
Something along these lines should do the trick:
txt = txt.replace(/(<.*?>)|(.)/g, function (m0, tag, ch) {
    // keep tags as they are; wrap every other character in a span
    return tag || ('<span>' + ch + '</span>');
});
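As a rough sketch, the same regex idea can be wrapped in a reusable function. The name wrapChars is made up for illustration and is not part of the answer above. This variant assumes two small changes: [\s\S] instead of . so that newlines are wrapped too, and the u flag so that characters outside the Basic Multilingual Plane (emoji, for instance) are wrapped as one character rather than two halves. Nested tags pass through untouched, because every tag is simply copied as-is.

```javascript
// Hypothetical helper; the name wrapChars is an assumption for illustration.
function wrapChars(html) {
    // Group 1: anything that looks like a tag, from "<" up to the next ">".
    // Group 2: any single character, including newlines; the "u" flag
    // makes it match whole code points instead of UTF-16 halves.
    return html.replace(/(<[^>]*>)|([\s\S])/gu, function (match, tag, ch) {
        // Tags are returned unchanged; everything else gets a span.
        return tag || '<span>' + ch + '</span>';
    });
}

console.log(wrapChars('a <b><i>b</i></b>'));
// <span>a</span><span> </span><b><i><span>b</span></i></b>
```

Like the original one-liner, this treats an entity such as &amp; as five separate characters and will be confused by a ">" inside an attribute value, so for arbitrary markup the DOM-based approach is the safer choice.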
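Don't use a regex; just loop over the string with a for loop:
var s = 'To Sherlock Holmes she is always <i>THE</i> woman.';
var out = '';
for (var z = 0; z < s.length; ++z) {
    var ch = s.charAt(z);
    if (ch == '<') {
        // copy the whole tag through unchanged; the length check
        // prevents an endless loop if the closing '>' is missing
        while (ch != '>' && z < s.length) {
            out += ch;
            ch = s.charAt(++z);
        }
        out += ch;
        continue;
    }
    // any character outside a tag gets its own span
    out += '<span>' + ch + '</span>';
}
alert(out);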