I am having some trouble using JSON.parse on certain characters. I receive this data via an API and have no means of forcing any form of encoding on the server side; the data is provided to me as-is.
This is the json in question:
{"name": "»»»»»»»"}
I created a jsfiddle with the json data and the basic JSON.parse function which returns "Unexpected token in JSON at position 11". (there are some special characters in there that you probably won't see in your browser, jsfiddle will show them)
https://jsfiddle.net/4u1LtvLm/2/
How would I go about fixing this string prior to doing JSON.parse on it, without losing the special characters?
EDIT: modified jsfiddle and json to only contain the string causing trouble, so it's less confusing for everyone.
asked Jul 25, 2017 at 18:55 by ruinernix; edited Jul 25, 2017 at 19:57
- what char is at 423? – Daniel A. White Commented Jul 25, 2017 at 18:56
- 1 that JSON worked just fine for me. – Daniel A. White Commented Jul 25, 2017 at 18:57
- Looks like it's the "»»»»»»»" – Jordan Kasper Commented Jul 25, 2017 at 18:57
- 1 In the fiddle, chances are it's the ones that light up like a Christmas tree, because they are invalid – adeneo Commented Jul 25, 2017 at 18:58
- Worked fine for me in the console in Chrome. – Jordan Kasper Commented Jul 25, 2017 at 18:58
3 Answers
My solution was this: https://stackoverflow.com/a/40558081/370709
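// Replace every character outside U+0000..U+007E with a \uXXXX escape.
function escapeUnicode(str) {
    return str.replace(/[^\0-~]/g, function (ch) {
        // charCodeAt(0) returns a UTF-16 code unit, so four hex digits always suffice.
        return "\\u" + ("0000" + ch.charCodeAt(0).toString(16)).slice(-4);
    });
}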
Problem solved!
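For anyone wanting to see it end to end, here is a minimal sketch of how the function is used before parsing. The `raw` string is a hypothetical stand-in for the API response, not the asker's actual payload.

```javascript
// Same function as above, repeated so the snippet runs on its own.
function escapeUnicode(str) {
    return str.replace(/[^\0-~]/g, function (ch) {
        return "\\u" + ("0000" + ch.charCodeAt(0).toString(16)).slice(-4);
    });
}

// Hypothetical stand-in for the response body; \u00bb is the » character.
const raw = '{"name": "\u00bb\u00bb\u00bb"}';

// After escaping, the JSON text is pure ASCII: each » has become the
// six-character sequence \u00bb.
const escaped = escapeUnicode(raw);

// JSON.parse decodes the \u00bb escapes back into », so nothing is lost.
const data = JSON.parse(escaped);
console.log(escaped);
console.log(data.name);
```

Note that the regular expression leaves everything from U+0000 to U+007E untouched, so this only rewrites characters above that range.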
JSON.parse needs to get a string which consists only of unicode characters (see Json parsing with unicode characters).
For you the JSON.parse method fails, because your string contains non-unicode characters. If you paste your string into http://jsonparseronline./ you will see that it fails because of the character, which is the character the browser displays if the string is not correctly encoded.
So, if you don't have a way to change the endcoding of your string, you won't be able to do this. You can try something like this to change the encoding, but ti give a definite answer you would need to know how your string is encoded in the first place
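To illustrate why the original encoding matters, here is a rough Node.js sketch. It assumes, purely for the sake of the example, that the server sends ISO-8859-1 (Latin-1) bytes; nothing in the question confirms that. `Buffer` is Node-specific, and the byte sequence is made up.

```javascript
// Made-up response bytes: {"name": "»"} encoded as Latin-1, where » is the single byte 0xBB.
const bytes = Buffer.from('{"name": "\u00bb"}', "latin1");

// Decoding with the wrong encoding: a lone 0xBB is not valid UTF-8,
// so it is turned into the replacement character U+FFFD and the » is gone.
const asUtf8 = bytes.toString("utf8");

// Decoding with the encoding the bytes were actually written in keeps the character.
const asLatin1 = bytes.toString("latin1");

console.log(asUtf8.includes("\ufffd"));
console.log(JSON.parse(asLatin1).name);
```

If the real encoding is something else, the same idea applies, but the encoding name passed to the decoder has to match what the server actually produces.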
The problem at position 423 is this character:
»
This is not a standard ASCII character. JSON text is expected to be UTF-8, so you should be able to have a character like this in a valid JSON string. It seems, though, that you must escape it properly.
I would convert the string by replacing those non-ASCII characters with their escaped versions (such as \u00bb and similar). Only then churn it through the JSON parser.
With \uXXXX escapes the parser decodes them back into the original characters, so the parsed data should not need any further conversion.
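A small sketch of the round trip, assuming a standards-compliant parser such as the built-in JSON.parse. The variable names are only for illustration.

```javascript
// JSON text containing the raw » character.
const literal = JSON.parse('{"name": "\u00bb"}');

// JSON text containing the six-character escape \u00bb instead.
const escaped = JSON.parse('{"name": "\\u00bb"}');

// Both forms parse to the same one-character string.
const same = literal.name === escaped.name;

// Serializing again emits the raw character, not the escape.
const reencoded = JSON.stringify(escaped);

console.log(same);
console.log(reencoded);
```

So the escaped form is only a transport detail: once parsed, the data holds the real character either way.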
EDIT: valid JSON text should in fact be UTF-8, but that's the standard. It is possible that a lousy non-standard implementation of a parser does not honor this and requires ASCII instead. Which obviously means that there's a lake of tears ahead in using it.
EDIT 2: Oh, wait. This is on node.js? Well, that's not a lousy implementation at all; in fact it's one of the best (fastest and most robust) I've ever come across... Consider converting to ASCII only as a last resort. If possible, identify the true culprit and solve the problem without conversion. As long as it is a UTF-8 string it should work right out of the box. If it's a UNICODE string, convert it to UTF-8 (not ASCII... forget about ASCII... node.js should work perfectly with UTF-8).
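One way to hunt for the true culprit is to list every code unit outside printable ASCII together with its position, then compare against the position in the error message. This is only a sketch: `findSuspectChars` is a hypothetical helper and `sample` is made-up input, not the asker's data.

```javascript
// Hypothetical helper: report each code unit outside 0x20..0x7E with its index.
function findSuspectChars(text) {
    const hits = [];
    for (let i = 0; i < text.length; i++) {
        const code = text.charCodeAt(i);
        if (code < 0x20 || code > 0x7e) {
            const hex = "U+" + code.toString(16).toUpperCase().padStart(4, "0");
            hits.push({ index: i, hex: hex });
        }
    }
    return hits;
}

// Made-up input: a » followed by an invisible control character (U+0003).
const sample = '{"name": "\u00bb\u0003"}';
const hits = findSuspectChars(sample);
console.log(hits);

// JSON does not allow raw control characters (U+0000..U+001F) inside strings,
// so this input is rejected even though » itself is fine.
let threw = false;
try {
    JSON.parse(sample);
} catch (e) {
    threw = true;
}
console.log(threw);
```

Invisible control characters are one possible cause of such an error; whether that is what happened in the question cannot be told from the posted text alone.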
BTW, by posting the string on the web you intrinsically lose the encoding and force it to UTF-8, which may be the reason why we cannot reproduce your problem.
EDIT 3: If in doubt, use this encoder.