The jQuery form .serialize() method serializes form contents to a string and automatically URL-encodes the string. My server then reverses this process and URL-decodes the string when deserializing it.
But what I need to be able to do is to HTML-encode the form contents before the form is serialized. In other words, if a user enters HTML into a text input in my form, I want this to be made safe using HTML encoding, then transmitted exactly as described above (using URL-encoding as normal).
Let me illustrate with an example:
Current implementation using .serialize()
- User enters My name is <b>Fred</b> into a form input with name Details.
- .serialize() serializes this as Details=My+name+is+%3Cb%3EFred%3C%2Fb%3E (URL-encoding).
- The server deserializes the string and gets My name is <b>Fred</b> (URL-decoding).
What I want to happen
- User enters My name is <b>Fred</b> into a form input with name Details.
- This gets HTML-encoded to My name is &lt;b&gt;Fred&lt;/b&gt; (HTML-encoding).
- .serialize() serializes this as Details=My+name+is+%26lt%3Bb%26gt%3BFred%26lt%3B%2Fb%26gt%3B (URL-encoding).
- The server URL-decodes the string and gets My name is &lt;b&gt;Fred&lt;/b&gt; (URL-decoding only).
I was hoping that .serialize() might take an argument to specify that the form contents should be HTML-encoded, but no such luck. A couple of other possible solutions would be:
- Iterate through the form inputs and HTML-encode them "by hand" before calling .serialize(): I'd rather not have to do this as it will make the code messier and less robust.
- Modify my server to accept non-HTML-encoded values: for various reasons I won't go into, this is problematic and not a practical solution.
Is there a simpler solution?
asked Apr 16, 2015 at 8:25 by Mark Whitaker
- Just to clarify, you can't encode on the server side? Someone could disable JS and submit the form (if it works) – uv_man Commented Apr 16, 2015 at 8:34
- You can use serializeArray instead and process the resulting array. (I'm not sure, but I don't think serializeArray encodes the values) – Felix Kling Commented Apr 16, 2015 at 8:34
- This usually requires a server-side solution. ASP.Net/MVC does this automatically (refuses unsafe data unless specifically allowed). Are you PHP or .Net based? – iCollect.it Ltd Commented Apr 16, 2015 at 8:36
- @uv_man No, see my comment 2 above. The site is inoperative without JavaScript, so the scenario you describe (while usually valid) isn't a concern here. – Mark Whitaker Commented Apr 16, 2015 at 8:38
- @TrueBlueAussie I'm using MVC and you're right, it is rejecting un-escaped HTML during deserialization. That's the problem! – Mark Whitaker Commented Apr 16, 2015 at 8:39
5 Answers
The solution is to use jQuery's .serializeArray() and apply the HTML-encoding to each element in a loop.
In other words, I had to change this:
$.ajax({
url: form.attr('action'),
async: false,
type: 'POST',
data: form.serialize(),
success: function (data) {
//...
}
});
to this:
// HTML-encode form values before submitting
var data = {};
$.each(form.serializeArray(), function() {
data[this.name] = this.value
.replace(/&/g, '&amp;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#39;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;');
});
$.ajax({
url: form.attr('action'),
async: false,
type: 'POST',
data: data,
success: function (data) {
//...
}
});
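One thing to watch with the loop above: collecting values into a plain object keeps only the last value when several fields share a name (checkbox groups, multi-selects). A minimal sketch of a variant that keeps the array shape instead is shown here. The names htmlEncode and encodePairs are hypothetical, and the pairs array is a stand-in for form.serializeArray() so the snippet runs without a DOM.

```javascript
// Hypothetical helper: HTML-encode the same five characters as the loop above.
function htmlEncode(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Takes an array shaped like the output of .serializeArray()
// ([{ name, value }, ...]) and returns a new array with every value
// HTML-encoded. Fields that share a name are all preserved.
function encodePairs(pairs) {
  return pairs.map(function (pair) {
    return { name: pair.name, value: htmlEncode(pair.value) };
  });
}

// Stand-in for form.serializeArray() (assumed sample data).
var pairs = [
  { name: "Details", value: "My name is <b>Fred</b>" },
  { name: "Tags", value: "a&b" },
  { name: "Tags", value: "\"c\"" }
];

var encoded = encodePairs(pairs);
console.log(encoded);
```

With jQuery on the page, encodePairs(form.serializeArray()) could be passed as the data option, since $.ajax accepts an array in the .serializeArray() format as well as a plain object.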
As you are using MVC (see comments), simply apply the [AllowHtml] attribute above the single property that requires it.
You will need to add the following using statement if not already present:
using System.Web.Mvc;
Note: If you are also using a MetadataTypeAttribute, it may not work out of the box (but that is unlikely to be a problem in this case).
Update
From the comments: as you cannot modify the form data properties (dynamic forms), you can turn validation off in the controller by putting the following on the controller action:
[ValidateInput(false)]
You can also change the setting for the entire server (less secure). See this blog entry:
http://weblogs.asp/imranbaloch/handling-validateinputattribute-globally
Input values will always get encoded by default. As you stated, you have to iterate through each value to decode it first. You can use the following jQuery snippet to do that:
$('<div/>').html(value).text();
One option might be to update the jQuery library directly and call htmlEncode on the DOM value before the URI-encoding happens.
I tested this in an ASP.NET/MVC app, and the line I updated in jquery-1.8.2.js (line 7222, depending on the version) was:
s[ s.length ] = encodeURIComponent( key ) + "=" + encodeURIComponent( value );
to
s[ s.length ] = encodeURIComponent( key ) + "=" + encodeURIComponent( htmlEncode(value) );
Use whichever htmlEncode method you find suitable, but it appears to work.
Might actually make more sense to extend this method out and call a customSerialize method which does the htmlEncode.
I believe this is the simplest way and means you don't have to iterate through the dom before calling serialize.
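The customSerialize idea above can be sketched without patching the library. This is only an illustration: htmlEncode and customSerialize are hypothetical names, and the function takes an array in the shape that .serializeArray() returns so it can run without a DOM.

```javascript
// Hypothetical htmlEncode; swap in whichever implementation you prefer.
function htmlEncode(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical customSerialize: HTML-encode each value first, then
// URL-encode the name and the value, and join the pairs with "&".
function customSerialize(pairs) {
  return pairs
    .map(function (pair) {
      return encodeURIComponent(pair.name) + "=" +
        encodeURIComponent(htmlEncode(pair.value));
    })
    .join("&")
    // Use "+" for spaces, matching the example output in the question.
    .replace(/%20/g, "+");
}

var query = customSerialize([
  { name: "Details", value: "My name is <b>Fred</b>" }
]);
console.log(query);
// Details=My+name+is+%26lt%3Bb%26gt%3BFred%26lt%3B%2Fb%26gt%3B
```

On a page with jQuery, customSerialize(form.serializeArray()) would produce the string to pass as data, leaving the library file itself untouched.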
A string has to be html-encoded after any other changes like url-encoding or the sql string-escape.
So you first serialize your string, use it in links and after deserializing you html-encode it. Just do it as before but use the function below.
Why is this important?
Because I can type a non-HTML-escaped string into the URL myself and then hand it to you. You would think it's escaped, but it wouldn't be. The solution is to escape it just before printing it on the page.
This question describes how to html-escape a string: HtmlSpecialChars equivalent in Javascript?
function escapeHtml(text) {
var map = {
'&': '&amp;',
'<': '&lt;',
'>': '&gt;',
'"': '&quot;',
"'": '&#039;'
};
return text.replace(/[&<>"']/g, function(m) { return map[m]; });
}
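To make the ordering concrete, here is a minimal sketch of the flow this answer recommends: transmit the value URL-encoded only, URL-decode it on receipt, and HTML-escape it only at the point of output. The function name escapeForOutput is hypothetical and mirrors the intent of escapeHtml above; URLSearchParams stands in for whatever the receiving side uses to parse the form body.

```javascript
// Hypothetical name; maps the same five characters as escapeHtml above.
function escapeForOutput(text) {
  var map = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#039;"
  };
  return text.replace(/[&<>"']/g, function (m) { return map[m]; });
}

// What .serialize() sends for the question's example (URL-encoded only).
var wire = "Details=My+name+is+%3Cb%3EFred%3C%2Fb%3E";

// URL-decode on receipt; URLSearchParams treats "+" as a space.
var raw = new URLSearchParams(wire).get("Details");

// HTML-escape only when the value is about to be written into a page.
var safe = escapeForOutput(raw);

console.log(raw);  // My name is <b>Fred</b>
console.log(safe); // My name is &lt;b&gt;Fred&lt;/b&gt;
```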