javascript - How should untrusted JSON be sanitized before using JSON.parse? - Stack Overflow

Given a user-provided JSON string, how can we sanitize it before running JSON.parse(untrustedString)?

My primary concern is about prototype pollution, but I'm also wondering what else I should potentially look out for? If it's just prototype pollution that's a risk, then I assume that could be handled via regex, but I suspect there are additional concerns?

For example, this article describes the dangers of parsing untrusted JSON and then creating a copy of the object:

Now consider some malicious JSON data sent to this endpoint.

{
  "user": {
    "__proto__": {
      "admin": true
    }
  }
} 

If this JSON is sent, JSON.parse will produce an object with a __proto__ property. If the copying library works as described above, it will copy the admin property onto the prototype of req.session.user!

Asked Sep 16, 2020 at 19:19 by Slbox; edited Sep 16, 2020 at 21:22 by trincot

Comments:
  • 3 json should pretty much always be safe to parse. – Daniel A. White Commented Sep 16, 2020 at 19:20
  • That was my initial impression, but the more I read the more paranoid I became. Can't JSON.parse() allow pollution of the object prototype? – Slbox Commented Sep 16, 2020 at 19:21
  • 1 i dont think it can - but it could create a malformed object – Daniel A. White Commented Sep 16, 2020 at 19:22
  • I've added an article with an example of one of my concerns to the question text. – Slbox Commented Sep 16, 2020 at 19:25
  • 2 ah thats just saying additional properties can come thru __proto__ and you should guard against that - nothing that JSON.parse is doing wrong... – Daniel A. White Commented Sep 16, 2020 at 19:28

3 Answers

My primary concern is about prototype pollution

Note that JSON.parse will not pollute any prototype object. If the JSON string has a "__proto__" key, then that key is created just like any other key, and whatever the corresponding value is in that JSON ends up as that own property's value, not in the prototype object (Object.prototype).

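The risk is in what you do with that object afterwards. If you perform a (deep) copy with property assignments or Object.assign, the assignment to "__proto__" goes through the inherited setter, so you could replace the target's prototype or, with a recursive merge, mutate Object.prototype itself.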
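To make the distinction concrete, here is a minimal sketch of the kind of copy that causes trouble. The naiveMerge function is a hypothetical stand-in for a vulnerable merge library, not any real package, and the demo undoes its own pollution at the end.

```javascript
// Hypothetical naive deep merge, written only to illustrate the risk.
function naiveMerge(target, source) {
  for (const key in source) {
    const value = source[key];
    if (value !== null && typeof value === "object") {
      if (typeof target[key] !== "object" || target[key] === null) {
        target[key] = {};
      }
      naiveMerge(target[key], value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

const parsed = JSON.parse('{"user":{"__proto__":{"admin":true}}}');

// JSON.parse created "__proto__" as an ordinary own property.
const ownKey = Object.prototype.hasOwnProperty.call(parsed.user, "__proto__");

// Nothing is polluted yet: a fresh object has no "admin" property.
const beforeMerge = ({}).admin;

// The merge reads target["__proto__"], which is Object.prototype,
// and then writes "admin" into it.
naiveMerge({ user: {} }, parsed);
const afterMerge = ({}).admin;

// Undo the pollution caused by this demo.
delete Object.prototype.admin;
```

Here ownKey is true and beforeMerge is undefined, while afterMerge is true: the parse was harmless and the merge did the damage.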

how can we sanitize it before running JSON.parse(untrustedString)? ... I assume that could be handled via regex

Do not use a regular expression for that. Use the second argument of JSON.parse:

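// Reviver: returning undefined makes JSON.parse delete that property from the result.
const cleaner = (key, value) => key === "__proto__" ? undefined : value;

// demo
let json = '{"user":{"__proto__":{"admin": true}}}';

// Without the reviver, "__proto__" is kept as an ordinary own property of user.
console.log(JSON.parse(json));
// With the reviver, the "__proto__" key is dropped and user is an empty object.
console.log(JSON.parse(json, cleaner));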

Before you do anything with it, userString is just a string, and nothing in that string, by itself, can harm a system, unless the system does something to allow that harm, like processing it in an unsafe way.

Enter JSON.parse().

JSON.parse() is simply a format conversion tool. It doesn't run any methods from the data (which prototype pollution exploits rely on), and it doesn't really look at the data in the stringified object beyond the structural syntax it contains and JavaScript reserved words, which it checks for validation purposes (MDN polyfill example). The same principle applies here as with the string: if you don't do anything unsafe with the output object, it can't hurt you or your system.

At the end of the day, preventing abuse simply boils down to validation and safe data handling practices:

  • Check the object that results from parsing the string, and validate it within strict parameters, ignoring prototype mutations.

  • Use Object.prototype.hasOwnProperty.call() rather than calling hasOwnProperty on the parsed object itself. Enforce these practices in your code base with tools like ESLint (no-prototype-builtins rule).

  • Don't expect users to send in perfect, safe data.
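As a rough illustration of the hasOwnProperty point in the list above, the sketch below uses a made-up payload in which the user has overwritten hasOwnProperty. The hasOwn helper and the payload are assumptions for demonstration only.

```javascript
// Hypothetical helper: borrow the method from Object.prototype so the
// check does not depend on anything the payload defines.
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Made-up payload that shadows hasOwnProperty with a string.
const parsedUser = JSON.parse('{"name":"alice","hasOwnProperty":"oops"}');

// Calling the shadowed method directly fails, because it is now a string.
let directCallFailed = false;
try {
  parsedUser.hasOwnProperty("name");
} catch {
  directCallFailed = true;
}

// The borrowed call still works as expected.
const hasName = hasOwn(parsedUser, "name");
const hasAdmin = hasOwn(parsedUser, "admin");

// "toString" is inherited, not own, so it is not treated as user data.
const hasToString = hasOwn(parsedUser, "toString");
```

This is the situation the no-prototype-builtins rule is meant to catch.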

In the article you linked, the author mentions this exact idea:

...data coming in from users should always be filtered and sanitized.

Probably the most important thing is to simply limit the size. Memory overflows can reach beyond your application's sandbox.
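A size check can be as simple as the sketch below. The parseWithLimit name and the 10,000-character limit are arbitrary assumptions, and the length is counted in UTF-16 code units rather than bytes. In a real service you would probably also want the limit enforced where the request body is read, before the whole string is in memory.

```javascript
// Hypothetical limit; pick a value that fits your own payloads.
const MAX_JSON_CHARS = 10000;

function parseWithLimit(untrustedString, maxChars = MAX_JSON_CHARS) {
  if (typeof untrustedString !== "string") {
    throw new TypeError("expected a string");
  }
  // Reject oversized input before handing it to the parser.
  if (untrustedString.length > maxChars) {
    throw new RangeError(
      `JSON input is ${untrustedString.length} characters; limit is ${maxChars}`
    );
  }
  return JSON.parse(untrustedString);
}

// Usage: small input parses, oversized input is rejected up front.
const small = parseWithLimit('{"a":1}');

let oversizedRejected = false;
try {
  parseWithLimit('"' + "x".repeat(20000) + '"');
} catch (err) {
  oversizedRejected = err instanceof RangeError;
}
```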

The next most important thing is to concern yourself with a character set and sanitize stuff that you're not going to work with.

If you're not expecting full Unicode support, explicitly don't support it (filter to something simpler like ASCII, after escapes are processed), because that code is complex and hits binary layers that could be corrupted.
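One way to read "after escapes are processed" is to check strings inside a JSON.parse reviver, since by then the parser has already decoded any \uXXXX sequences. The sketch below takes that approach. The asciiOnlyReviver name and the choice to reject rather than strip characters are assumptions, not something the answer prescribes.

```javascript
// Printable ASCII only: space (0x20) through tilde (0x7E).
const PRINTABLE_ASCII = /^[\x20-\x7E]*$/;

// Hypothetical reviver: runs after JSON.parse has decoded escapes,
// and rejects any key or string value outside printable ASCII.
function asciiOnlyReviver(key, value) {
  if (!PRINTABLE_ASCII.test(key)) {
    throw new Error("non-ASCII key rejected");
  }
  if (typeof value === "string" && !PRINTABLE_ASCII.test(value)) {
    throw new Error(`non-ASCII value rejected for key "${key}"`);
  }
  return value;
}

// Plain ASCII passes through unchanged.
const asciiOk = JSON.parse('{"name":"alice"}', asciiOnlyReviver);

// The JSON text below is pure ASCII, but the \u00ef escape decodes to a
// non-ASCII character, so the check has to happen after parsing.
let escapedRejected = false;
try {
  JSON.parse('{"name":"al\\u00efce"}', asciiOnlyReviver);
} catch {
  escapedRejected = true;
}
```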

Finally, have an explicit list of keys that you support and validate the values in each one of them so that you don't have to worry about your application logic going astray.
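A minimal sketch of such an allow-list follows. The field names, the limits, and the pickValidated function are hypothetical. The idea is to copy only the keys you know about into a fresh object and validate each value on the way.

```javascript
// Hypothetical schema: each supported key maps to a validator for its value.
const allowedFields = {
  name: (v) => typeof v === "string" && v.length > 0 && v.length <= 64,
  age: (v) => Number.isInteger(v) && v >= 0 && v <= 150,
};

function pickValidated(untrustedString) {
  const input = JSON.parse(untrustedString);
  if (input === null || typeof input !== "object" || Array.isArray(input)) {
    throw new TypeError("expected a JSON object");
  }
  // A null-prototype object has nothing to inherit from or pollute.
  const clean = Object.create(null);
  for (const [key, isValid] of Object.entries(allowedFields)) {
    if (!Object.prototype.hasOwnProperty.call(input, key)) continue;
    if (!isValid(input[key])) {
      throw new TypeError(`invalid value for "${key}"`);
    }
    clean[key] = input[key];
  }
  return clean;
}

// Unknown keys such as "__proto__" and "role" are simply never copied.
const result = pickValidated(
  '{"name":"alice","age":30,"__proto__":{"admin":true},"role":"root"}'
);
```

Because only listed keys are ever read, a "__proto__" key in the input never reaches an assignment, so this approach does not depend on a separate cleaning step.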
