We all know we can use JSON.parse() to convert the string '{"a":0,"b":"haha"}' to the object {a: 0, b: 'haha'}.
But can we convert the string '{a: 0, b: "haha"}' to the object {a: 0, b: 'haha'}?
I'm writing a web crawler and I need to get the data in the page. But the complete data is not in the DOM; it is in one <script> element. So I took the useful content from the <script> and converted that string (like 'window.Gbanners = [{...}, {...}, {...}, ...];') to a JSON-like string (like '{banners : [...]}'). However, I couldn't parse the "JSON-like" string. Does anyone have a solution?
- Possible duplicate of Convert JSON string to array of JSON objects in Javascript – abr Commented Mar 5, 2018 at 14:12
- Dare I suggest eval? (and simply get rid of the first and last ') – Adam Jenkins Commented Mar 5, 2018 at 14:32
- Should we trust eval? – Li Enze Commented Mar 5, 2018 at 15:09
- Are we fine with executing arbitrary code the scraper might find on the internet? – Kos Commented Mar 6, 2018 at 9:05
- @Kos No, we're not. I checked the script that I crawled and found it just did window.Gbanners=[];. Then I handled this script string simply and eval'ed it. Maybe I'm fine with it temporarily? -^_^- – Li Enze Commented Mar 6, 2018 at 11:37
3 Answers
A string like {a: 0, b: "haha"} is not JSON, but just a bunch of JavaScript code.
The best way to get a JSON representation of the data inside is to run it through a JS parser (such as Esprima), traverse the syntax tree and build a JSON-compatible object out of it. This needs some work, but at least the parsing will be done correctly, with proper handling of escape sequences.
Here's a starting point:
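const esprima = require("esprima");
// Parentheses make the parser read the braces as an object literal.
const code = '({a: 0, b: "haha"})';
const ast = esprima.parse(code);
// body[0] is the ExpressionStatement; its expression is the ObjectExpression.
const properties = ast.body[0].expression.properties;
// Copy each top-level key/value pair (flat literals only).
const output = properties.reduce((result, property) => {
  result[property.key.name] = property.value.value;
  return result;
}, {});
console.log(output);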
This code assumes a lot about what the input code looks like - might be OK for a prototype, but still needs error checking and handling nested objects.
(A more generic approach could involve a recursive function that takes an ObjectExpression
and returns an equivalent JSON.)
I also had to wrap your input in parentheses so that it's an expression (not a block statement) according to JS grammar.
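The recursive function mentioned above could look roughly like the following. This is only a sketch: `nodeToValue` is a hypothetical name, it assumes ESTree-shaped nodes (the shape Esprima produces), and it deliberately throws on anything that is not plain data. The sample tree is written by hand so the snippet runs without Esprima installed.

```javascript
// Sketch: turn an ESTree expression node into a JSON-compatible value.
// Any node that is not plain data (identifiers, calls, regexes, ...) throws.
function nodeToValue(node) {
  switch (node.type) {
    case "Literal":
      // Regex literals carry a `regex` field and have no JSON equivalent.
      if (node.regex) throw new SyntaxError("Regex literals are not data");
      return node.value;
    case "UnaryExpression": {
      // Signed numbers such as -1 are parsed as a unary operator plus a literal.
      const operand = nodeToValue(node.argument);
      if (typeof operand === "number" && node.operator === "-") return -operand;
      if (typeof operand === "number" && node.operator === "+") return operand;
      throw new SyntaxError(`Unsupported unary expression: ${node.operator}`);
    }
    case "ArrayExpression":
      return node.elements.map((element) => {
        if (element === null) throw new SyntaxError("Array holes are not supported");
        return nodeToValue(element);
      });
    case "ObjectExpression": {
      const result = {};
      for (const prop of node.properties) {
        // Reject spreads, getters/setters and computed keys.
        if (prop.type !== "Property" || prop.kind !== "init" || prop.computed) {
          throw new SyntaxError("Unsupported property form");
        }
        // Keys may be bare identifiers (a) or literals ("a", 1).
        const key = prop.key.type === "Identifier" ? prop.key.name : String(prop.key.value);
        result[key] = nodeToValue(prop.value);
      }
      return result;
    }
    default:
      throw new SyntaxError(`Unsupported node type: ${node.type}`);
  }
}

// Hand-written tree for ({a: 0, b: [-1, "haha"]}).
const sampleAst = {
  type: "ObjectExpression",
  properties: [
    { type: "Property", kind: "init", computed: false,
      key: { type: "Identifier", name: "a" },
      value: { type: "Literal", value: 0 } },
    { type: "Property", kind: "init", computed: false,
      key: { type: "Identifier", name: "b" },
      value: { type: "ArrayExpression", elements: [
        { type: "UnaryExpression", operator: "-",
          argument: { type: "Literal", value: 1 } },
        { type: "Literal", value: "haha" },
      ] } },
  ],
};
console.log(JSON.stringify(nodeToValue(sampleAst))); // {"a":0,"b":[-1,"haha"]}
```

With Esprima available, the same function would be fed the node from the snippet above, i.e. ast.body[0].expression.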
Something like this might work:
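function evalJsString(str) {
  let a = null;
  try {
    // Direct eval runs in this scope, so it can assign to the local `a`.
    eval('a = ' + str);
  } catch (err) {
    console.error(err);
  }
  // Return objects (including arrays) only; anything else becomes null.
  return typeof a === "object" ? a : null;
}
evalJsString('({a: 0, b: "haha"})');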
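For the crawler case in the question, where the script has the form window.Gbanners = [...];, a variation is to run the whole script with Node's built-in vm module against a stand-in window object, instead of trimming the string first. The sketch below assumes Node.js; `extractWindowGlobal` is a hypothetical helper name. Note that the Node documentation states vm is not a security mechanism, so this keeps the script away from local variables but does not make untrusted code safe to run.

```javascript
const vm = require("vm");

// Hypothetical helper: run a script that assigns to `window.<name>` and
// return the assigned value as plain JSON data (or null if there is none).
function extractWindowGlobal(scriptText, name) {
  // Stand-in for the browser's `window`; the script sees nothing else of ours.
  const sandbox = { window: {} };
  // The timeout stops runaway loops; it is not a security boundary.
  vm.runInNewContext(scriptText, sandbox, { timeout: 1000 });
  // Round-trip through JSON so only plain data leaves the context.
  const json = JSON.stringify(sandbox.window[name]);
  return json === undefined ? null : JSON.parse(json);
}

const script = 'window.Gbanners = [{id: 1, title: "haha"}, {id: 2}];';
console.log(extractWindowGlobal(script, "Gbanners"));
```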
Because eval() executes whatever code it is given, it's better not to use it on untrusted input. One alternative is to write your own parser that converts the input to a JSON string and then apply JSON.parse(). Something like the following:
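function toJSONString(input) {
  // An unquoted key: a run of characters without quotes, commas, braces or whitespace.
  const keyMatcher = '([^",{}\\s]+?)';
  // Captures only the first character after the colon (plus any commas);
  // the rest of the value is left untouched by the replacement.
  const valMatcher = '(.,*)';
  const matcher = new RegExp(`${keyMatcher}\\s*:\\s*${valMatcher}`, 'g');
  // Re-emit the key wrapped in double quotes.
  const parser = (match, key, value) => `"${key}":${value}`;
  return input.replace(matcher, parser);
}
JSON.parse(toJSONString('{a: 0, b: "haha"}'))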
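One caveat worth checking before relying on this approach: the regex treats every colon as a key separator, including colons inside string values such as URLs. A minimal sketch of that failure mode, repeating toJSONString so it runs on its own (`tryParse` is a hypothetical wrapper that returns null when parsing fails):

```javascript
// Same function as in the answer above, repeated so this snippet is self-contained.
function toJSONString(input) {
  const keyMatcher = '([^",{}\\s]+?)';
  const valMatcher = '(.,*)';
  const matcher = new RegExp(`${keyMatcher}\\s*:\\s*${valMatcher}`, 'g');
  const parser = (match, key, value) => `"${key}":${value}`;
  return input.replace(matcher, parser);
}

// Hypothetical wrapper: return the parsed object, or null if the
// rewritten string is not valid JSON.
function tryParse(input) {
  try {
    return JSON.parse(toJSONString(input));
  } catch (err) {
    return null;
  }
}

console.log(tryParse('{a: 0, b: "haha"}'));           // { a: 0, b: 'haha' }
// The "http:" inside the value is mistaken for a key, so the result is not JSON.
console.log(tryParse('{url: "http://example.com"}')); // null
```

For scraped data that may contain URLs or timestamps, a real parser is the more robust route.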