javascript - Using node-validator to strip HTML tags from a string - Stack Overflow

I'm looking for a way to strip tags from a string in the backend using NodeJS. A few answers here have suggested trying node-validator but neither the docs nor any answers explain specifically how to use it.

For instance, I have a string in a variable like this:

INPUT:

var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>'

DESIRED OUTPUT:

var newText = 'Hello there! I am a string but not a very exciting one!'

The node-validator docs list several options; I think the most pertinent is the trim() function:

var check = require('validator').check,
    sanitize = require('validator').sanitize

//Validate
check('test@email.com').len(6, 64).isEmail();        //Methods are chainable
check('abc').isInt();                                //Throws 'Invalid integer'
check('abc', 'Please enter a number').isInt();       //Throws 'Please enter a number'
check('abcdefghijklmnopzrtsuvqxyz').is(/^[a-z]+$/);

//Sanitize / Filter
var int = sanitize('0123').toInt();                  //123
var bool = sanitize('true').toBoolean();             //true
var str = sanitize(' \t\r hello \n').trim();       //'hello'
var str = sanitize('aaaaaaaaab').ltrim('a');         //'b'
var str = sanitize(large_input_str).xss();
var str = sanitize('&lt;a&gt;').entityDecode();      //'<a>'

Is it possible to use this to strip tags (as well as classes) from a string?

EDIT: I also have cheerio (essentially jQuery) loaded and was trying to use something like this:

HTML
<div class="select">
<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>
</div>

JAVASCRIPT
(function() {
    var text = $('.select *').each(function() {
        var content = $(this).contents();
        $(this).replaceWith(content);
    });
    return text;
}());

But this results in an 'Object '<p><b>Hello....' has no method "contents"' error. I'm open to using a similar function if it's easier with jQuery.

Asked Jun 5, 2013 at 10:10 by JVG; edited Jun 5, 2013 at 10:20.

3 Answers

I don't use node-validator, but something like this works for me:

var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>';

// replace() returns a new string, so assign the result back
text = text.replace(/(<([^>]+)>)/ig, "");

Output

Hello there! I am a string but not a very exciting one!

Now you can trim the result with node-validator's trim().

Got the code snippet from here
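
As a minimal sketch of the regex approach wrapped into a reusable function: stripTags is a hypothetical helper name, not something node-validator provides, and the whitespace collapsing is an extra step added for illustration. It treats anything between < and > as a tag, so it is not a real HTML parser and should not be relied on as an XSS sanitizer.

```javascript
// Hypothetical helper (not part of node-validator): removes anything that
// looks like an HTML tag, then tidies the leftover whitespace.
function stripTags(html) {
    return String(html)
        .replace(/<[^>]+>/g, '')   // drop every <...> sequence
        .replace(/\s+/g, ' ')      // collapse runs of whitespace into one space
        .trim();                   // remove leading and trailing whitespace
}

var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very    exciting one!</span></p>';
var newText = stripTags(text);
console.log(newText); // Hello there! I am a string but not a very exciting one!
```

Input where a > appears inside an attribute value, or that contains comments or script blocks, will probably need a parser-based approach instead.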

You can get your desired output using the string.js node module, which you can install with npm.

Here is the code I used:

var S = require('string');
var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>';
console.log(text);
text = S(text).stripTags().s;
console.log(text);

Output:

<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>
Hello there! I am a string but not a very exciting one!

How to install string.js:

npm install --save string

Further reference

It doesn't look like node-validator has any sort of HTML tag stripping built in. trim() wouldn't work, as it seems you can only specify individual characters to remove. The library is easy to extend, though, so you could write an extension for it that strips out HTML tags.

Otherwise, you could use the cheerio .text() method to get the combined text contents of an element and its descendants.

Something like this should work:

$('.select *').each(function() {
    var content = $(this).text();
    $(this).replaceWith(content);
});

That will remove any HTML within a .select. Remove the * from the selector if you want the .select itself to be replaced too.
