Is it possible to determine whether or not a javascript function is "pure", using javascript?
A function is said to be pure when it behaves in a predictable way, in the sense that for each x, the function will always return the same associated y value (i.e a single-valued map).
For example, a pure function:
function pure(x) {
return x * x;
}
And impure:
var x = 0;
function impure(y) {
x = x + y;
return x++;
}
While it's easy to tell here that impure(0) !== impure(0)
, it isn't as apparent with a function such as:
function weird(x) {
if (x === "specificThing") {
return false;
} else {
return true;
}
}
or
var count = 0;
function surprise(x) {
count++;
if (count === 10e10 && x === 0) {
return true;
} else {
return false;
}
}
Another way of asking this is, is it possible to determine whether or not a javascript function is "impure", using javascript?
Theoretically it may be possible or impossible, but practically what steps can one take to start to determine this, maybe given a set of constraints or assumptions?
Another definition of purity includes the caveat that non-local variables may not be changed, but I'd like to consider that a separate problem. In this case we are considering functions that map consistent inputs to consistent outputs.
Is it possible to determine whether or not a javascript function is "pure", using javascript?
A function is said to be pure when it behaves in a predictable way, in the sense that for each x, the function will always return the same associated y value (i.e a single-valued map).
For example, a pure function:
function pure(x) {
return x * x;
}
And impure:
var x = 0;
function impure(y) {
x = x + y;
return x++;
}
While it's easy to tell here that impure(0) !== impure(0)
, it isn't as apparent with a function such as:
function weird(x) {
if (x === "specificThing") {
return false;
} else {
return true;
}
}
or
var count = 0;
function surprise(x) {
count++;
if (count === 10e10 && x === 0) {
return true;
} else {
return false;
}
}
Another way of asking this is, is it possible to determine whether or not a javascript function is "impure", using javascript?
Theoretically it may be possible or impossible, but practically what steps can one take to start to determine this, maybe given a set of constraints or assumptions?
Another definition of purity includes the caveat that non-local variables may not be changed, but I'd like to consider that a separate problem. In this case we are considering functions that map consistent inputs to consistent outputs.
Share Improve this question edited Apr 24, 2015 at 23:27 mxiong asked Apr 24, 2015 at 22:40 mxiongmxiong 2511 silver badge10 bronze badges 17-
Theoretically? Sure. Practically? No idea. Someone may have written a parser to do such a thing. Build a parse tree, mark certain things known to be impure (e.g.
Math.random()
), then work your way up. It would certainly be interesting to see if someone has written a tool to do this. – thetoast Commented Apr 24, 2015 at 22:46 - Are you asking if this can be determined without looking at the source code for the function? Or, what inputs are you allowed? Certainly a person can study the code and determine if there are any conditions where it might have impure behaviors. So, are you asking if a puter could study the code to tell you this? – jfriend00 Commented Apr 24, 2015 at 22:46
- 4 Related: programmers.stackexchange./questions/176761/… - nutshell version, you'd have to solve the halting problem to do it. – jdphenix Commented Apr 24, 2015 at 22:46
- Purity can be determined recursively: a function is pure if it does not refer to any free variables, and doesn't use any impure functions or operators. But I don't know how you would determine this IN Javascript. – Barmar Commented Apr 24, 2015 at 22:47
- 1 That Programmers question has a link to this SO question: stackoverflow./questions/1609303/… – Barmar Commented Apr 24, 2015 at 22:50
3 Answers
Reset to default 5Even your first function isn't pure, because in JavaScript *
may invoke a ToNumber conversion, which may invoke arbitrary user code, if x
happens to be an object with a user-defined toString
or valueOf
method, or somebody happened to monkey-patch Object.prototype
.
The sad truth is that in JS almost nothing can be proved pure. The only operations that can never have side effects are ===
, !==
, !
, &&
, ||
, and typeof
. This is a huge problem for optimisations in pilers, btw.
JavaScript is Turing plete, so it's just as able to parse and analyse JavaScript as any other programming language. So, the question is really: "can JavaScript functions be automatically tested for "purity", at all?"
Short answer
Only sometimes.
Long answer
For some functions, for which the AST is straight forward and all the symbols are contained, definitely. Something like function(X) { return X * X; }
is provably pure (for primitive input) because the only variables used in the function body are variables that are passed in as function arguments. This function does not rely on any additional API calls, but pure arithmetics. We can most definitely show it's pure.
Things change when we allow arbitrary content, because JavaScript has no explicit types, and instead will happily type-coerce from plex to primitive data type (or even from primitive to primitive data type) when it needs to perform an operation that can't have that operation applied. Calling our above function with an object, instead of a number, performs further function calls underwater, and those functions are not guaranteed pure at all (see Andreas's answer on this, too)
The vast majority of JS functions are nothing like our simple function. For most functions, we need to prove not only that they're pure, but that all the functions they call internally are pure, too. Now we're running into the halting problem. Let's take a ridiculously simple example:
function(x) {
if (typeof window !== "undefined") {
return x * x;
}
return x * x * x;
}
Is this pure? Well, if we run this in the browser, then in the browser it's pure, since window
is always defined. But in something like Node.js, it might be pure, but it might not be: we can't prove it is, nor can we prove it is not, because we cannot prove that this mysterious window
variable exists when the function runs. While Node.js has no global window
variable, we can trivially introduce one whenever we like, and the function's behaviour will change. Now we're suddenly faced with proving whether or not the entirety of our code will ever introduce a window
variable (and that may be very creatively done, like global["win" + _abc] = true
where _abc
is the string "dow"
). This is a lost cause.
The rabbit hole runs deep, and reading about the halting problem will make you appreciate how many difference faces the halting problem has.
sample code. limitation: can only *guess* if some code is pure = can only give a hint, but no guarantee
/* programmatically guess if some javascript code is pure or impure
npm install acorn acorn-walk
license is CC0-1.0 */
const acorn_parse = require("acorn").parse;
const acorn_walk = require("acorn-walk");
// the code to analyze
const content = `
['k1'].map(key => {
const val = data[key];
let local1;
var local2 = 2;
var local3, local4;
global1[key] = val; // find me
global2.key = val; // find me
global3.push(val); // find me
global4.pop(); // find me
global5.shift(); // find me
global6 = 'impure'; // find me
const local7 = global7[1234];
var local8 = global8.map(x => 2*x);
var local9 = global9.filter(Boolean);
const local10 = global10.pop(); // find me
local1 = 'ok';
local2.push('ok');
return [key, val];
})
`;
// method names for our educated guess
const write_method_set = new Set(['push', 'pop', 'shift']);
const read_method_set = new Set(['map', 'filter', 'reduce', 'forEach']);
const is_write_method = method_name => write_method_set.has(method_name);
const is_read_method = method_name => read_method_set.has(method_name);
const is_local = name => (name != undefined && local_var_set.has(name));
const get_src = node => content.substring(node.start, node.end);
function test_assign(node, left_name) {
if (left_name == undefined) {
console.log(`TODO implement: detect write access in:`);
console.dir(node);
return;
}
if (!is_local(left_name)) console.log(`impure write access to global ${left_name}: ${get_src(node)}`);
else console.log(`pure? write access to local ${left_name}: ${get_src(node)}`);
}
function test_call(node, left_name, method_name) {
if (left_name == undefined) {
console.log(`TODO implement: detect write access in:`)
console.dir(node);
return;
}
if (is_read_method(method_name)) return console.log(`pure? access to ${left_name}: ${get_src(node)}`);
if (!is_local(left_name)) {
if (is_write_method(method_name)) console.log(`impure write access to global ${left_name}: ${get_src(node)}`);
else console.log(`pure? access to global ${left_name}: ${get_src(node)}`);
}
else console.log(`pure? write access to local ${left_name}: ${get_src(node)}`)
}
const local_var_set = new Set();
// throws on syntax error
let ast = acorn_parse(content, { ecmaVersion: 2020, sourceType: "module" });
acorn_walk.full(ast, (node, state, type) => {
if (node.type == 'VariableDeclaration') {
node.declarations.forEach(d => {
local_var_set.add(d.id.name);
console.log(`declare local: ${d.id.name}`);
});
}
else if (node.type == 'AssignmentExpression') {
const left_name =
node.left.type == 'Identifier' ? node.left.name :
node.left.type == 'MemberExpression' ? node.left.object.name :
undefined
;
test_assign(node, left_name);
}
else if (node.type == 'CallExpression') {
if (node.callee.object.type == 'ArrayExpression') return; // simply ignore
const left_name =
node.callee.type == 'MemberExpression' ? node.callee.object.name :
undefined
;
const method_name =
node.callee.type == 'MemberExpression' ? node.callee.property.name :
undefined
;
test_call(node, left_name, method_name);
}
//else console.dir(node);
});
sample output
$ node test.js | grep impure
impure write access to global global1: global1[key] = val
impure write access to global global2: global2.key = val
impure write access to global global3: global3.push(val)
impure write access to global global4: global4.pop()
impure write access to global global5: global5.shift()
impure write access to global global6: global6 = 'impure'
impure write access to global global10: global10.pop()