I have an application that I'm writing in Node.js which needs to make a lot of configuration and database calls in order to process user data. The issue I'm having is that after 11,800+ function calls Node will throw an error and exit the process.
The error says: RangeError: Maximum call stack size exceeded
I'm curious if anyone else has had this situation arise and to know how they handled this. I've already started to break up my code into a couple of extra worker files but even so each time I process a data node it needs to touch 2 databases (at most 25 calls to update various tables) and do a number of sanitization checks.
I'm fully willing to admit that I may be doing something suboptimal; if so, I'd appreciate some guidance on a better approach.
Here is an example of the code I'm running on data:
app.post('/initspeaker', function(req, res) {
    // if the Admin ID is not present ignore
    if(req.body.xyzid!=config.adminid) {
        res.send( {} );
        return;
    }
    var gcnt = 0, dbsize = 0, goutput = [], goutputdata = [], xyzuserdataCallers = [];
    xyz.loadbatchfile( xyz.getbatchurl("speakers", "csv"), function(data) {
        var parsed = csv.parse(data);
        console.log("lexicon", parsed[0]);
        for(var i=1;i<parsed.length;i++) {
            if(typeof parsed[i][0] != 'undefined' && parsed[i][0]!='name') {
                var xyzevent = require('./lib/model/xyz_speaker').create(parsed[i], parsed[0]);
                xyzevent.isPresenter = true;
                goutput.push(xyzevent);
            }
        }
        dbsize = goutput.length;
        xyzuserdataCallers = [new xyzuserdata(),
            new xyzuserdata(),
            new xyzuserdata(),
            new xyzuserdata(),
            new xyzuserdata(),
            new xyzuserdata(),
            new xyzuserdata(),
            new xyzuserdata()
        ];
        // insert all Scheduled Items into the DB
        xyzuserdataCallers[0].sendSpeakerData(goutput[0]);
        for(var i=1;i<xyzuserdataCallers;i++) {
            xyzuserdataCallers[i].sendSpeakerData(8008);
        }
        //sendSpeakerData(goutput[0]);
    });
    var callback = function(data, func) {
        //console.log(data);
        if(data && data!=8008) {
            if(gcnt>=dbsize) {
                res.send("done");
            } else {
                gcnt++;
                func.sendSpeakerData(goutput[gcnt]);
            }
        } else {
            gcnt++;
            func.sendSpeakerData(goutput[gcnt]);
        }
    };
    // callback loop for fetching registrants for events from SMW
    var xyzuserdata = function() {};
    xyzuserdata.prototype.sendSpeakerData = function(data) {
        var thisfunc = this;
        if(data && data!=8008) {
            //console.log('creating user from data', gcnt, dbsize);
            var userdata = require('./lib/model/user').create(data.toObject());
            var speakerdata = userdata.toObject();
            speakerdata.uid = uuid.v1();
            speakerdata.isPresenter = true;
            couchdb.insert(speakerdata, config.couch.db.user, function($data) {
                if($data==false) {
                    // if this fails it is probably due to a UID colliding
                    console.log("*** trying user data again ***");
                    speakerdata.uid = uuid.v1();
                    arguments.callee( speakerdata );
                } else {
                    callback($data, thisfunc);
                }
            });
        } else {
            gcnt++;
            arguments.callee(goutput[gcnt]);
        }
    };
});
A couple of classes and items are defined here that need some introduction:
- I am using Express.js + hosted CouchDB and this is responding to a POST request
- There is a CSV parser class that loads a list of events which drives pulling speaker data
- Each event can have n number of users (currently around 8K users for all events)
- I'm using a pattern that loads all of the data/users before attempting to parse any of them
- Each user loaded (external data source) is converted into an object I can use and also sanitized (strip slashes and such)
- Each user is then inserted into CouchDB
This code works in the app, but after a while I get an error saying that more than 11,800 calls have been made, and the app breaks. This isn't an error that contains a stack trace like one would see if it were a code error; the process exits because of the number of calls being made.
Again, any assistance/commentary/direction would be appreciated.
Asked Feb 1, 2012 at 17:27 by Liam; edited Feb 6, 2013 at 12:27 by Ron Wertlen
Comments:
- Constraints and stored procedures. – Ryan Olds Commented Feb 1, 2012 at 17:33
- Not sure what you mean. Since I'm using CouchDB as the data store there aren't any stored procedures on the database side. As for constraints, could you elaborate a bit? – Liam Commented Feb 1, 2012 at 17:35
- What's that actual text of the error? – mike Commented Feb 1, 2012 at 20:00
- The error says "RangeError: Maximum call stack size exceeded" and I've added this to the question – Liam Commented Feb 1, 2012 at 20:37
2 Answers
It looks like xyzuserdata.sendSpeakerData and callback are being used recursively in order to keep the DB calls sequential. At some point you run out of call stack...
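To illustrate how this pattern can exhaust the stack, here is a minimal sketch. It assumes the insert callback is invoked synchronously, before the insert function returns; fakeInsert and the other names below are hypothetical stand-ins, not the CouchDB client's API. When every callback really fires on a later turn of the event loop, the stack unwinds between items and this chain does not grow.

```javascript
// Hypothetical stand-in for a store whose insert() invokes its callback
// synchronously, before insert() itself has returned.
function fakeInsert(item, done) {
  done(true);
}

// Same shape as sendSpeakerData/callback in the question:
// finishing one item directly starts the next one.
function processAllRecursively(items, index, onFinished) {
  if (index >= items.length) {
    return onFinished();
  }
  fakeInsert(items[index], function () {
    // Still inside the previous call's frames, so the stack keeps growing.
    processAllRecursively(items, index + 1, onFinished);
  });
}

// Runs the recursive loop over `count` dummy items and reports what happened.
function tryRecursive(count) {
  var items = new Array(count).fill(0);
  try {
    processAllRecursively(items, 0, function () {});
    return 'ok';
  } catch (err) {
    return err instanceof RangeError ? 'RangeError' : 'other';
  }
}
```

A short list completes normally, while a very long one should fail with the same RangeError as in the question. The exact number of items at which it fails depends on the stack size Node is running with.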
There are several modules that make serial execution easier, like Step or Flow-JS.
Flow-JS even has a convenience function to apply a function serially over the elements of an array:
flow.serialForEach(goutput, xyzuserdata.sendSpeakerData, ...)
I wrote a small test program using flow.serialForEach, but unfortunately I was still able to get a Maximum call stack size exceeded error. It looks like Flow-JS uses the call stack in a similar way to keep things in sync.
Another approach that doesn't build up the call stack is to avoid recursion and use setTimeout with a timeout value of 0 to schedule the callback call. See http://metaduck./post/2675027550/asynchronous-iteration-patterns-in-node-js
You could try replacing the callback call with
setTimeout(callback, 0, $data, thisfunc)
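Here is a rough sketch of the same idea applied to the whole loop rather than to a single call. The names processAllDeferred, insert and schedule are made up for illustration, and insert stands in for whatever async call you are making. The schedule parameter defaults to setTimeout with a delay of 0; in Node, setImmediate should also work there and avoids the minimum timer delay.

```javascript
// Processes items one at a time without nesting calls on the stack.
// `insert(item, done)` is a hypothetical async-style function; `schedule`
// decides how the next step is deferred (setTimeout(fn, 0) by default).
function processAllDeferred(items, insert, onFinished, schedule) {
  var index = 0;
  var defer = schedule || function (fn) { setTimeout(fn, 0); };

  function next() {
    if (index >= items.length) {
      return onFinished(index);
    }
    insert(items[index], function () {
      index += 1;
      // Hand the next step to the scheduler instead of calling it directly,
      // so the current call stack unwinds before the next item starts.
      defer(next);
    });
  }

  next();
}
```

Usage would look something like processAllDeferred(goutput, insertOne, function (count) { res.send("done"); }), where insertOne is a hypothetical wrapper that inserts a single record and then calls done.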
Recursion is very useful for synchronizing async operations; that's why it is used in flow.js etc.
However, if you want to process an unlimited number of elements in an array or a buffered stream, you will need to use Node.js's EventEmitter.
In pseudo-ish code:
var EventEmitter = require('events').EventEmitter;
var ee = new EventEmitter();
var arr = A_very_long_array_to_process;
var callback = callback_to_call_once_either_with_an_error_or_when_done;
// the worker function does everything
function processOne() {
    // nothing left to do: signal completion and stop
    if( arr.length === 0 ) {
        ee.emit( 'finished' );
        return;
    }
    var next = arr.shift();
    // processItem is a placeholder for the real async work on one element
    // (named so as not to shadow Node's global process object)
    processItem( next, function( err, response ) {
        if( err )
            callback( err, response );
        else
            ee.emit( 'done-one' );
    } );
}
// here we process the final event that the worker will throw when done
ee.on( 'finished', function() { callback( null, 'we processed the entire array!' ); } );
// here we say what to do after one thing has been processed
ee.on( 'done-one', function() { processOne(); } );
// here we get the ball rolling
processOne();
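One thing worth checking with this approach: EventEmitter's emit() calls its listeners synchronously, so the stack only unwinds between items if the worker completes asynchronously. Below is a hedged sketch that wraps the pattern in a function and defers the next step with setImmediate, so it holds up even if the worker calls back synchronously. processQueue and worker are hypothetical names, not part of any library.

```javascript
const { EventEmitter } = require('events');

// Runs `worker(item, done)` over every item in order, then calls
// `callback(err, processedCount)`. `worker` is a hypothetical function
// that reports completion through done(err).
function processQueue(items, worker, callback) {
  const ee = new EventEmitter();
  let index = 0;

  function processOne() {
    if (index >= items.length) {
      ee.emit('finished');
      return;
    }
    worker(items[index], function (err) {
      if (err) {
        return callback(err);
      }
      index += 1;
      ee.emit('done-one');
    });
  }

  ee.on('finished', function () { callback(null, index); });
  // emit() runs this listener synchronously, so defer the next item
  // explicitly to let the current call stack unwind first.
  ee.on('done-one', function () { setImmediate(processOne); });

  processOne();
}
```

An error from the worker stops the run and is passed straight to the callback; otherwise the callback receives the number of items processed.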