What is the regular expression that Microsoft's .NET Framework uses for the standard request validation that throws HttpRequestValidationException ("A potentially dangerous Request.Form value was detected from the client") when HTML or other potentially unsafe content is posted?
I'd like to have an exact copy of it converted to JavaScript so the user can be alerted early.
My current regular expression (/(&#)|<[^<>]+>/) is close, but not the same as .NET's.
I'm aware this might differ between .NET versions, so specifically I'd like to know:
- A regular expression for .NET 2
- A regular expression for .NET 4
- I very much doubt that regexes are the only component involved (if they are involved at all) in detecting that... – fge, commented Jan 5, 2012 at 14:23
2 Answers
You can use a decompiling tool and see for yourself that there is no regular expression at all. The validation calls the static method CrossSiteScriptingValidation.IsDangerousString.
But maybe you can use the Microsoft AntiXSS library to achieve the same thing. Anyway, here is the method:
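    internal static bool IsDangerousString(string s, out int matchIndex)
    {
        matchIndex = 0;
        int num1 = 0;
        // Scan repeatedly for the next '<' or '&' (the contents of startingChars).
        while (true)
        {
            int num2 = s.IndexOfAny(CrossSiteScriptingValidation.startingChars, num1);
            if (num2 < 0)
            {
                // No more candidate characters: the string is safe.
                return false;
            }
            if (num2 == s.Length - 1)
            {
                // A candidate that is the last character cannot start a pattern.
                return false;
            }
            matchIndex = num2;
            char chars = s[num2];
            char next = s[num2 + 1];
            // '<' followed by a letter, '!', '/' or '?' looks like a tag or comment.
            if (chars == '<' && (CrossSiteScriptingValidation.IsAtoZ(next) || next == '!' || next == '/' || next == '?'))
            {
                return true;
            }
            // '&' followed by '#' looks like a numeric character reference.
            if (chars == '&' && next == '#')
            {
                return true;
            }
            // Not a match here; continue searching after this character.
            num1 = num2 + 1;
        }
    }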
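Since the goal is an early warning in the browser, the same scan can be ported to JavaScript without any regex. What follows is only a minimal sketch. It assumes the two rules are "<" followed by a letter, "!", "/" or "?", and "&" followed by "#". The names isAtoZ, findDangerousIndex and isDangerousString are made up for illustration and are not part of any library.

```javascript
// Hypothetical helper: true only for ASCII letters, mirroring an A-to-Z check.
function isAtoZ(c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Returns the index of the first suspicious sequence, or -1 if none is found.
// Assumed rules: '<' followed by a letter, '!', '/' or '?'; '&' followed by '#'.
function findDangerousIndex(s) {
  // Stop one short of the end: a trailing '<' or '&' cannot start a pattern.
  for (let i = 0; i < s.length - 1; i++) {
    const c = s[i];
    const next = s[i + 1];
    if (c === '<' && (isAtoZ(next) || next === '!' || next === '/' || next === '?')) {
      return i;
    }
    if (c === '&' && next === '#') {
      return i;
    }
  }
  return -1;
}

// Convenience wrapper for a simple yes/no answer.
function isDangerousString(s) {
  return findDangerousIndex(s) !== -1;
}
```

Returning the index rather than just a boolean lets the page highlight where the offending text starts. The server-side check still has to stay in place, because client-side validation can be bypassed.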
I might have answered this in another question here: https://stackoverflow.com/a/4949339/62054
This regex follows the logic in .NET 4.
/^(?!(.|\n)*<[a-z!\/?])(?!(.|\n)*&#)(.|\n)*$/i
Look in the .NET source for CrossSiteScriptingValidation to find the logic that Microsoft follows. fge is right: it doesn't use a regex. Instead, it uses a loop and some string comparisons. I suspect that's for performance.
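The same two rules can also be written as a positive match, which may be easier to read than the negative lookaheads. One caveat with the lookahead form: in JavaScript, "." does not match a carriage return, so a value containing "\r" never matches and ends up reported as unsafe. A rough sketch follows. The names DANGEROUS, looksDangerous and firstDangerousField are hypothetical, and the sketch assumes the rules are exactly the two encoded in the regex above.

```javascript
// Assumed rules, expressed positively: '<' followed by a letter, '!', '/' or '?',
// or '&' followed by '#'. No 'g' flag, so test() keeps no state between calls.
const DANGEROUS = /<[a-z!\/?]|&#/i;

// True when the value contains a sequence the assumed rules would reject.
function looksDangerous(value) {
  return DANGEROUS.test(value);
}

// Hypothetical usage: given an object of field names to values,
// return the name of the first field that would be rejected, or null.
function firstDangerousField(fields) {
  for (const [name, value] of Object.entries(fields)) {
    if (looksDangerous(String(value))) {
      return name;
    }
  }
  return null;
}
```

On a real page, the values would come from the form's inputs before submit, and the returned field name could drive the alert shown to the user.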