javascript - What causes node.js to wait until the request finishes?

So I wasted a bunch of time writing some code like this:

const util = require('util');
const request = require('request');

let waiting = true;
let count = 0;

function processResponse(error, response, body) {
    if (!error && response.statusCode == 200) {
        console.log(body);
    } else {
        console.error(util.inspect(response, false, null));
    }
    waiting = false;
}

request.get(requestOpts.url, processResponse);

console.log("Waiting");
// Busy-wait until the callback clears the flag
while (waiting) {
    count += 1;
    if (count % 10000000 == 0) {
        console.log(count);
    }
}

I was trying to get node to wait (and not exit) until the response came back from the web server. It turns out this didn't work, and what did work was doing nothing. Just:

request.get(requestOpts.url, processResponse);

How did request keep node from exiting while the callback was pending?

asked Feb 1, 2016 at 3:48 by boatcoder

The asynchronous primitive that request uses (in the http module?) does tell node not to exit while there's a callback waiting for a result. – Bergi Commented Feb 1, 2016 at 3:51
Could you elaborate on this primitive? – boatcoder Commented Feb 1, 2016 at 4:04
It's native. It's implemented in the internals of the node platform. It talks to the event loop directly. There's not much more about it to say. – Bergi Commented Feb 1, 2016 at 4:55
Google "event loop". JavaScript is one of the two languages I know of with an event loop at its heart. The other is Tcl. – slebetman Commented Feb 1, 2016 at 6:57
I might be horribly wrong but: that (endless) loop will block the event loop from processing received data and since it's stuck in that loop no data can be read from the socket and thus not be handled. – try-catch-finally Commented Apr 28, 2018 at 14:08

3 Answers

Node always keeps track of any pending callbacks and will not exit until that count hits zero. This includes timers, all active network connections and requests, as well as filesystem I/O and subprocess activity. There's nothing special you need to code to get the expected behavior; Node will do what you expect in this case by default.
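As a rough illustration of that bookkeeping, here is a minimal sketch that uses timers instead of a network request (the variable names are made up for this example). A regular timer counts as pending work, while calling unref() on a timer tells Node not to wait for it.

```javascript
// A regular timer is pending work: Node stays alive until it has fired.
const keepAlive = setTimeout(() => {
  console.log('timer fired; nothing else is pending, so Node can exit')
}, 100)

// unref() opts this timer out: on its own it will not keep the process running.
const background = setTimeout(() => {
  console.log('not reached: the process exits long before this is due')
}, 60 * 1000)
background.unref()

// hasRef() reports whether a timer still counts as pending work.
console.log('end of script, hasRef:', keepAlive.hasRef(), background.hasRef())
```

Run as a script, this should stay alive for roughly 100 ms after the last line and then exit by itself, without waiting a minute for the second timer. A pending HTTP request keeps the process alive in the same way the first timer does.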

TLDR

In the code from the OP, the synchronous while-loop prevents the event loop from ever reaching the poll phase, so the event loop gets stuck on its first tick and the network I/O is never processed.

Complete answer

Node.js's asynchronous programming model is based on a synchronous loop called the event loop. The basic abstraction of the event loop is function scheduling: a function in Node.js is able to schedule other functions (like network request handlers and timers) to run at some point in the future and in response to some event.

The event loop is basically a continuously running "while loop". During each "tick" of the loop, the Node.js runtime checks to see whether the condition for a scheduled function is met -- if the condition is met (like if a timer has elapsed), the function is executed.
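To make the "while loop" idea concrete, here is a toy model. It is only a sketch of the concept and not how Node.js or libuv is actually implemented; runToyLoop and the task objects are hypothetical names invented for this illustration.

```javascript
// Toy model: each task has a name and the tick at which its condition is met.
function runToyLoop (tasks) {
  const log = []
  let pending = [...tasks]
  let tick = 0

  // Keep looping while anything is still pending; stop when nothing is left.
  while (pending.length > 0) {
    tick += 1
    const ready = pending.filter((task) => task.readyAt <= tick)
    pending = pending.filter((task) => task.readyAt > tick)
    // "Run" every task whose condition is met on this tick.
    for (const task of ready) {
      log.push(`${tick}:${task.name}`)
    }
  }
  return log
}

const log = runToyLoop([
  { name: 'timer', readyAt: 3 },
  { name: 'response', readyAt: 2 }
])
console.log(log)
```

In this model the loop ends as soon as the list of pending tasks is empty, which mirrors why a real Node process exits once nothing is left to wait for.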

The event loop processes callbacks in a particular order.

   ┌───────────────────────────┐
┌─>│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘

The only place for network requests to be processed is during the poll phase; so in order for a network request to be processed by a callback, the event loop must reach the poll phase.
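The phase order in the diagram can be observed with a small experiment. This is a sketch that uses a filesystem call as a stand-in for network I/O (an assumption made so that it runs without a network connection): inside an I/O callback, which runs in the poll phase, a setImmediate() callback (check phase) should run before a zero-delay setTimeout() callback (timers phase of the next iteration).

```javascript
const fs = require('fs')

const order = []

const finished = new Promise((resolve) => {
  // The fs.stat() callback is an I/O callback, so it runs in the poll phase.
  fs.stat('.', () => {
    // Timers phase: only reached on the next trip around the loop.
    setTimeout(() => {
      order.push('timeout')
      resolve(order)
    }, 0)
    // Check phase: comes straight after poll in the same iteration.
    setImmediate(() => order.push('immediate'))
  })
})

finished.then((result) => console.log(result.join(' -> ')))
```

The Node.js documentation describes this ordering for callbacks scheduled within an I/O cycle, so the immediate callback is expected to be recorded first.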

Each stage of the event loop is only able to advance after the synchronous functions from the previous stage have finished running -- i.e. they have returned, and the call stack is empty.

In the code from the OP, the synchronous while-loop prevents the event loop from ever reaching the poll phase, so the network request handler is never executed.
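The same effect can be reproduced without any network access. In this minimal sketch (the events array is just an illustrative name), a timer that is due immediately still cannot run while synchronous code occupies the thread:

```javascript
const events = []

// Due "immediately", but it can only run once the call stack is empty.
setTimeout(() => events.push('timer callback'), 0)

// Block the thread for about 50 ms, the way the OP's while-loop does.
const start = Date.now()
while (Date.now() - start < 50) {
  // busy-wait
}

events.push('synchronous code finished')
console.log(events)
```

At the point where console.log() runs, the timer callback has not had a chance to fire even though its delay elapsed long ago. In the OP's code the loop never ends, so the response callback never gets that chance at all.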

Looping while waiting on a network request

In order to run code in a loop while we wait on the network request, we need to somehow give the event loop an opportunity to process other callbacks.

We can achieve this by running our loop in a callback scheduled via setTimeout(). setTimeout() is a mechanism for scheduling functions to run during the timers phase of the event loop.

Each time our loop finishes, the event loop has an opportunity to process new I/O events. If there's no new I/O event, it will move on to the next setTimeout() handler scheduled via loop().

const request = require('request')

let count = 0
let isFinished = false

// Flip the flag once the response (or an error) arrives.
request.get('https://www.google.com', (err, res) => {
  isFinished = true
})

console.log('Waiting')
loop()

function loop () {
  // One iteration per timers phase, so the event loop can
  // process I/O between iterations.
  setTimeout(() => {
    count += 1
    if (isFinished) {
      console.log('done')
      return
    }

    if (count % 10 === 0) {
      console.log(count)
    }

    loop()
  }, 0)
}

In this example, we keep "recursively" calling loop until the network request has finished. Each time the setTimeout() handler (scheduled by loop()) returns, the event loop moves beyond the timers phase and checks for new network I/O (i.e. our request to Google). As soon as the response is ready, our network request handler is called during the poll phase.

Even though loop() is "recursive", it doesn't increase the size of the call stack. Since each loop iteration is scheduled via setTimeout, the loop() function won't push another loop() call onto the call stack until the setTimeout() handler is run in the next timers phase - at which point the call stack from the previous loop() call will already be cleared.
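One way to check the claim about the call stack is to record the stack depth on every iteration. This is a rough sketch: stackDepth() is a hypothetical helper that counts the lines of an Error stack trace, which is only an approximation of the real call stack depth.

```javascript
// Hypothetical helper: approximate the current depth from a stack trace.
function stackDepth () {
  return new Error().stack.split('\n').length
}

const depths = []

function loop (remaining, onDone) {
  setTimeout(() => {
    depths.push(stackDepth())
    if (remaining === 0) {
      onDone(depths)
      return
    }
    // Schedules the next iteration; does not call it directly.
    loop(remaining - 1, onDone)
  }, 0)
}

const finished = new Promise((resolve) => loop(5, resolve))
finished.then((result) => console.log(result))
```

If each iteration were nested inside the previous one, the recorded numbers would climb. Because every handler starts from a fresh stack, they should stay flat.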

This code should be enough:

const util = require('util');
const request = require('request');

function processResponse(error, response, body) {
    console.log("Request complete");
    if (!error && response.statusCode == 200) {
        console.log(body);
    } else {
        // On a network failure `response` is undefined, so log the error if there is one
        console.error(util.inspect(error || response, false, null));
    }
}

console.log("Sending a request");
// Just as a test, try passing the URL hardcoded here,
// because your problem might be just an invalid requestOpts object
request.get("http://stackoverflow.com", processResponse);

One thing that is totally wrong with your previous code is that the loop actually prevents the request from being processed. Node.js runs your JavaScript on a single event loop thread, and that loop consumes 100% of it, leaving no room for the response callback to run.
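To try the "do nothing" approach without the request library or an external site, here is a self-contained sketch built on Node's http module. The local server, its 50 ms delay and the 'hello' body are stand-ins invented for this example, not part of the original question.

```javascript
const http = require('http')

// Hypothetical local server standing in for the remote web server.
const server = http.createServer((req, res) => {
  setTimeout(() => res.end('hello'), 50) // simulate some latency
})

const bodyReceived = new Promise((resolve, reject) => {
  // Port 0 asks the OS for any free port.
  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address()
    http.get({ host: '127.0.0.1', port, agent: false }, (res) => {
      let body = ''
      res.setEncoding('utf8')
      res.on('data', (chunk) => { body += chunk })
      res.on('end', () => {
        server.close() // release the listening socket so the process can exit
        resolve(body)
      })
    }).on('error', reject)
  })
})

console.log('End of script; Node keeps running while work is pending')
bodyReceived.then((body) => console.log('Response:', body))
```

There is no waiting loop here. The script reaches its last line immediately, yet the process stays alive until the response has been read and the server has been closed.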

However, if the code above does not work as expected, it might be related to the request library that you are using. If you are using request-promise instead of request, you need to use a different syntax.
