Is there any way to strip an HTML tag together with its content?
example :
const regexForStripHTML = /(<([^>]+)>)/gi
const text = "OCEP <sup>®</sup> water product"
const stripContent = text.replaceAll(regexForStripHTML, '')
output : 'OCEP ® water product'
I need to remove the ® from the string as well.
Expected output:
OCEP water product
- Did you mean to type const stripContent = text.replaceAll(regexForStripHTML, '')? That might be the issue for replacing the HTML tags. – phentnil Commented Mar 18, 2022 at 5:37
- @phentnil updated to const stripContent = text.replaceAll(regexForStripHTML, '') – Sooraj Jose Commented Mar 18, 2022 at 5:44
- Do you need it with any HTML tags? – Marc Anthony B Commented Mar 18, 2022 at 5:53
4 Answers
Removing all HTML tags together with their inner text can be done with the following snippet. The regular expression captures the opening tag's name, then matches all content between the opening and closing tags, then uses the captured tag name to match the closing tag.
const regexForStripHTML = /<([^</> ]+)[^<>]*?>[^<>]*?<\/\1> */gi;
const text = "OCEP <sup>®</sup> water product";
const stripContent = text.replaceAll(regexForStripHTML, '');
console.log(text);
console.log(stripContent);
This should suffice for your use case:
// [^>]* skips any attributes without running past the end of the opening tag
const regexForStripHTML = /<sup\b[^>]*>.*?<\/sup>/gi
const text = "OCEP <sup>®</sup> water product"
const stripContent = text.replaceAll(regexForStripHTML, '');
console.log(stripContent);
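If you want to do it with any HTML tag, see the code below:
// The captured tag name (\1) must reappear in the closing tag; [^>]* skips attributes
const regexForStripHTML = /<([a-z][a-z0-9]*)\b[^>]*>.*?<\/\1>/gi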
const text = "OCEP <html>®</html> water product"
const stripContent = text.replaceAll(regexForStripHTML, '');
console.log(stripContent);
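Removing an element from the middle of a sentence leaves the spaces on both sides of it behind, so the result can contain a double space where the question expects a single one. A minimal sketch of a cleanup step follows; the function name stripSupAndTidy is hypothetical, and the pattern assumes the element does not span multiple lines (the dot does not match newlines).

```javascript
// Hypothetical helper, not from the answers above: removes <sup>…</sup>
// elements with their content, then collapses the leftover whitespace.
function stripSupAndTidy(input) {
  return input
    .replace(/<sup\b[^>]*>.*?<\/sup>/gi, '') // drop the element and its content
    .replace(/\s{2,}/g, ' ')                 // collapse runs of whitespace to one space
    .trim();                                 // remove leading/trailing whitespace
}

console.log(stripSupAndTidy("OCEP <sup>®</sup> water product"));
// prints "OCEP water product"
```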
Context
To remove the text between the tags, you need to match opening and closing tags that share the same tag name. The opening tag is matched by <(?<tagname>.*?)>. The named group tagname remembers the tag name, and the backreference \k<tagname> reuses it in <\/\k<tagname>> to match the corresponding closing tag. The .*? in between matches any text.
Code
const regexForStripHTML = /(<(?<tagname>.*?)>.*?<\/\k<tagname>>)/g
const text = "OCEP <sup>®</sup> water product"
// Replace each matched element with an empty string ('$' would insert a literal dollar sign)
const stripContent = text.replaceAll(regexForStripHTML, '')
Note
I haven't thought about what happens if the tags are nested.
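One possible way to cope with nested tags, sketched here as an assumption rather than a tested solution: use a pattern that only matches elements containing no further tags (the one from the first answer) and apply it repeatedly, so the innermost elements are removed first. The function name stripNestedElements is hypothetical, and a regex is still not a full HTML parser, so void elements such as <br> or malformed markup are left as they are.

```javascript
// Matches one element whose content contains no other tags (an innermost
// element), plus any spaces that directly follow the closing tag.
const innermostElement = /<([^<\/> ]+)[^<>]*?>[^<>]*?<\/\1> */gi;

// Hypothetical helper: keeps removing innermost elements until nothing changes.
function stripNestedElements(input) {
  let previous;
  let current = input;
  do {
    previous = current;
    current = current.replace(innermostElement, '');
  } while (current !== previous);
  return current;
}

console.log(stripNestedElements("OCEP <span><sup>®</sup> brand</span> water product"));
// prints "OCEP water product"
```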
If you are using jQuery, there is an elegant way of doing it:
var res = $('<div>').append(text).text();
The created <div> ensures a correct result even when the text has already been stripped.