Should you read this guide?
If you’re writing anything more complicated than a brief command-line script, reading this should help you write higher-performance, more-secure applications. This document is written with Node.js servers in mind, but the concepts apply to complex Node.js applications as well. Where OS-specific details vary, this document is Linux-centric.
Summary
Node.js runs JavaScript code in the Event Loop (initialization and callbacks), and offers a Worker Pool to handle expensive tasks like file I/O. Node.js scales well, sometimes better than more heavyweight approaches like Apache. The secret to the scalability of Node.js is that it uses a small number of threads to handle many clients. If Node.js can make do with fewer threads, then it can spend more of your system’s time and memory working on clients rather than on paying space and time overheads for threads (memory, context-switching). But because Node.js has only a few threads, you must structure your application to use them wisely. Here’s a good rule of thumb for keeping your Node.js server speedy: keep the work done for each client at any given time small, both in callbacks on the Event Loop and in tasks on the Worker Pool.
Why should I avoid blocking the Event Loop and the Worker Pool?
Node.js uses a small number of threads to handle many clients. In Node.js there are two types of threads: one Event Loop (aka the main loop, main thread, event thread, etc.), and a pool of k Workers in a Worker Pool (aka the threadpool).
If a thread is taking a long time to execute a callback (Event Loop) or a task (Worker), we call it “blocked”. While a thread is blocked working on behalf of one client, it cannot handle requests from any other clients. This provides two motivations for blocking neither the Event Loop nor the Worker Pool:
Performance
If you regularly perform heavyweight activity on either type of thread, the throughput (requests/second) of your server will suffer.
Security
If it is possible that for certain input one of your threads might block, a malicious client could submit this “evil input”, make your threads block, and keep them from working on other clients. This would be a Denial of Service attack.
A quick review of Node
Node.js uses the Event-Driven Architecture: it has an Event Loop for orchestration and a Worker Pool for expensive tasks.
What code runs on the Event Loop?
When they begin, Node.js applications first complete an initialization phase, require’ing modules and registering callbacks for events. Node.js applications then enter the Event Loop, responding to incoming client requests by executing the appropriate callback. This callback executes synchronously, and may register asynchronous requests to continue processing after it completes. The callbacks for these asynchronous requests will also be executed on the Event Loop.
The Event Loop will also fulfill the non-blocking asynchronous requests made by its callbacks, e.g., network I/O.
In summary, the Event Loop executes the JavaScript callbacks registered for events, and is also responsible for fulfilling non-blocking asynchronous requests like network I/O.
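The ordering described above can be seen in a small sketch. The function name runDemo and the labels are hypothetical, and setImmediate stands in for any asynchronous request. It only illustrates that each callback runs to completion before the next one starts.

```javascript
// Hypothetical demo: records the order in which code runs.
function runDemo(done) {
  const order = [];

  // Initialization phase: runs synchronously, registering a callback.
  order.push('initialization');
  setImmediate(() => {
    // This callback runs on the Event Loop once initialization has finished.
    order.push('first callback');

    // A callback may register further asynchronous work...
    setImmediate(() => {
      order.push('follow-up callback');
      done(order);
    });

    // ...but that work only runs after the current callback completes.
    order.push('first callback finished');
  });
  order.push('initialization finished');
}

runDemo((order) => console.log(order.join(' -> ')));
```

Nothing here is specific to servers; a request handler registered during initialization follows the same run-to-completion pattern.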
What code runs on the Worker Pool?
The Worker Pool of Node.js is implemented in libuv, which exposes a general task submission API. Node.js uses the Worker Pool to handle “expensive” tasks. This includes I/O for which an operating system does not provide a non-blocking version, as well as particularly CPU-intensive tasks. These are the Node.js module APIs that make use of this Worker Pool:
- I/O-intensive
  - DNS: dns.lookup(), dns.lookupService().
  - File System: All file system APIs except fs.FSWatcher() and those that are explicitly synchronous use libuv’s threadpool.
- CPU-intensive
  - Crypto: asynchronous APIs such as crypto.pbkdf2() and crypto.randomBytes().
  - Zlib: All zlib APIs except those that are explicitly synchronous.
How does Node.js decide what code to run next?
Abstractly, the Event Loop and the Worker Pool maintain queues for pending events and pending tasks, respectively. In truth, the Event Loop does not actually maintain a queue. Instead, it has a collection of file descriptors that it asks the operating system to monitor, using a mechanism like epoll (Linux), kqueue (OSX), event ports (Solaris), or IOCP (Windows). These file descriptors correspond to network sockets, any files it is watching, and so on. When the operating system says that one of these file descriptors is ready, the Event Loop translates it to the appropriate event and invokes the callback(s) associated with that event. In contrast, the Worker Pool uses a real queue whose entries are tasks to be processed. A Worker pops a task from this queue and works on it, and when finished the Worker raises an “At least one task is finished” event for the Event Loop.
What does this mean for application design?
In a one-thread-per-client system like Apache, each pending client is assigned its own thread. If a thread handling one client blocks, the operating system will interrupt it and give another client a turn. The operating system thus ensures that clients that require a small amount of work are not penalized by clients that require more work. Because Node.js handles many clients with few threads, if a thread blocks handling one client’s request, then pending client requests may not get a turn until the thread finishes its callback or task. The fair treatment of clients is thus the responsibility of your application. This means that you shouldn’t do too much work for any client in any single callback or task.
Don’t block the Event Loop
The Event Loop notices each new client connection and orchestrates the generation of a response. All incoming requests and outgoing responses pass through the Event Loop. This means that if the Event Loop spends too long at any point, no current or new client will get a turn. You should make sure you never block the Event Loop. In other words, each of your JavaScript callbacks should complete quickly. This of course also applies to your await’s, your Promise.then’s, and so on.
A good way to ensure this is to reason about the “computational complexity” of your callbacks. If your callback takes a constant number of steps no matter what its arguments are, then you’ll always give every pending client a fair turn. If your callback takes a different number of steps depending on its arguments, then you should think about how long the arguments might be.
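To make the complexity reasoning concrete, here is a minimal sketch of callbacks with different costs. All function names and the MAX_VALUES limit are hypothetical, chosen only for illustration.

```javascript
// Constant time: the same number of steps whatever the argument is.
function greet(name) {
  return 'Hello, ' + name;
}

// Linear time, O(n): one pass over the input.
function sumValues(values) {
  let total = 0;
  for (const value of values) {
    total += value;
  }
  return total;
}

// Quadratic time, O(n^2): every element is compared with every later element.
function countEqualPairs(values) {
  let pairs = 0;
  for (let i = 0; i < values.length; i++) {
    for (let j = i + 1; j < values.length; j++) {
      if (values[i] === values[j]) pairs++;
    }
  }
  return pairs;
}

// Hypothetical limit: rejecting long inputs caps the worst-case running time.
const MAX_VALUES = 1000;

function countEqualPairsBounded(values) {
  if (values.length > MAX_VALUES) {
    throw new RangeError('input too long');
  }
  return countEqualPairs(values);
}
```

The right limit depends on how long your server can afford to spend in one callback, so treat the value above as a placeholder.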
How careful should you be?
Node.js uses the Google V8 engine for JavaScript, which is quite fast for many common operations. Exceptions to this rule are regexps and JSON operations, discussed below. However, for complex tasks you should consider bounding the input and rejecting inputs that are too long. That way, even if your callback has large complexity, by bounding the input you ensure the callback cannot take more than the worst-case time on the longest acceptable input.
Blocking the Event Loop: REDOS
One common way to block the Event Loop disastrously is by using a “vulnerable” regular expression.
Avoiding vulnerable regular expressions
A regular expression (regexp) matches an input string against a pattern. We usually think of a regexp match as requiring a single pass through the input string, which is O(n) time where n is the length of the input string. In many cases, a single pass is indeed all it takes. Unfortunately, in some cases the regexp match might require an exponential number of trips through the input string, which is O(2^n) time.
A vulnerable regular expression is one on which your regular expression engine might take exponential time, exposing you to REDOS on “evil input”.
A REDOS example
Consider a vulnerable regexp that checks whether a client-supplied string looks like a file path, exposing its server to REDOS. If a client submits the input ///.../\n (100 /’s followed by a newline character that the regexp’s “.” won’t match), then the match will take effectively forever, blocking the Event Loop.
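As a hedged illustration, the sketch below uses an assumed regexp with a nested quantifier, (\/.+)+, as the vulnerable path check. The names VULNERABLE_PATH_RE, looksLikePath, and buildEvilInput are hypothetical. Only call looksLikePath on the evil input with a small slash count: the matching time grows roughly exponentially with the number of slashes.

```javascript
// Assumed example of a vulnerable regexp: the quantified group (\/.+)+
// contains another quantifier, so a failing match can backtrack exponentially.
const VULNERABLE_PATH_RE = /(\/.+)+$/;

// Returns true if the candidate contains a "/"-delimited sequence of names
// that runs to the end of the string, e.g. "/a/b/c".
function looksLikePath(candidate) {
  return VULNERABLE_PATH_RE.test(candidate);
}

// Builds the "evil input": many slashes followed by a newline, which "."
// does not match, so every way of splitting the slashes is tried and fails.
function buildEvilInput(slashCount) {
  return '/'.repeat(slashCount) + '\n';
}

console.log(looksLikePath('/a/b/c'));            // benign input
console.log(looksLikePath(buildEvilInput(10)));  // small evil input, still quick
```

With 10 slashes the failing match finishes quickly; raising the count step by step shows how fast the cost grows.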
Anti-REDOS resources
There are some tools to check your regexps for safety, like safe-regex and rxxr2. However, neither of these will catch all vulnerable regexps. Another approach is to use a different regexp engine. You could use the node-re2 module, which uses Google’s blazing-fast RE2 regexp engine. But be warned, RE2 is not 100% compatible with V8’s regexps, so check for regressions if you swap in the node-re2 module to handle your regexps. If you’re trying to match something “obvious”, like a URL or a file path, find an example in a regexp library or use an npm module, e.g. ip-regex.
Blocking the Event Loop: Node.js core modules
Several Node.js core modules have synchronous expensive APIs, including those for encryption (crypto), compression (zlib), the file system (fs), and child processes (child_process). These APIs are expensive because they involve significant computation (encryption, compression), require I/O (file I/O), or potentially both (child process). These APIs are intended for scripting convenience, but are not intended for use in the server context. If you execute them on the Event Loop, they will take far longer to complete than a typical JavaScript instruction, blocking the Event Loop.
Blocking the Event Loop: JSON DOS
JSON.parse and JSON.stringify are other potentially expensive operations. While these are O(n) in the length of the input, for large n they can take surprisingly long.
If your server manipulates JSON objects, particularly those from a client, you should be cautious about the size of the objects or strings you work with on the Event Loop.
If you must handle large JSON, some npm modules offer stream or asynchronous JSON APIs, for example:
- JSONStream, which has stream APIs.
- Big-Friendly JSON, which has stream APIs as well as asynchronous versions of the standard JSON APIs using the partitioning-on-the-Event-Loop paradigm outlined below.
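For modest payloads, one simple precaution is to reject oversized input before calling JSON.parse. The sketch below assumes a hypothetical limit (MAX_JSON_LENGTH) and a hypothetical helper name (parseBoundedJson); pick a limit that matches what your clients legitimately send.

```javascript
// Hypothetical limit, measured in string length (UTF-16 code units).
const MAX_JSON_LENGTH = 100 * 1024;

// Parses text only if it is a string no longer than maxLength.
// Returns { ok: true, value } on success, or { ok: false, error } otherwise.
function parseBoundedJson(text, maxLength = MAX_JSON_LENGTH) {
  if (typeof text !== 'string' || text.length > maxLength) {
    return { ok: false, error: 'input too large' };
  }
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: 'invalid JSON' };
  }
}

console.log(parseBoundedJson('{"a": 1}'));
console.log(parseBoundedJson('x'.repeat(200), 100));
```

Because parsing is O(n), bounding n bounds the time the Event Loop can spend in a single JSON.parse call.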
Complex calculations without blocking the Event Loop
Suppose you want to do complex calculations in JavaScript without blocking the Event Loop. You have two options: partitioning or offloading.
Partitioning
You could partition your calculations so that each runs on the Event Loop but regularly yields (gives turns to) other pending events. In JavaScript it’s easy to save the state of an ongoing task in a closure. For a simple example, suppose you want to compute the average of the numbers 1 to n.
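A minimal sketch of that average computation follows, first unpartitioned and then partitioned. The function names and STEPS_PER_TURN are hypothetical, and n is assumed to be at least 1. The partitioned version keeps its state (sum and i) in a closure and uses setImmediate to yield between batches.

```javascript
// Unpartitioned: O(n) steps in a single callback.
function averageSync(n) {
  let sum = 0;
  for (let i = 1; i <= n; i++) {
    sum += i;
  }
  return sum / n;
}

// Hypothetical batch size: how many additions to do before yielding.
const STEPS_PER_TURN = 1000;

// Partitioned: each turn on the Event Loop does at most STEPS_PER_TURN steps,
// then yields so that other pending events get a turn.
function averagePartitioned(n, callback) {
  let sum = 0;
  let i = 1;

  function step() {
    const end = Math.min(i + STEPS_PER_TURN - 1, n);
    for (; i <= end; i++) {
      sum += i;
    }
    if (i > n) {
      callback(sum / n);
      return;
    }
    setImmediate(step); // schedule the next batch
  }

  setImmediate(step);
}

averagePartitioned(2500, (avg) => console.log('average:', avg));
```

Smaller batches give other clients turns more often at the price of more scheduling overhead.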
Offloading
If you need to do something more complex, partitioning is not a good option. This is because partitioning uses only the Event Loop, and you won’t benefit from the multiple cores almost certainly available on your machine. Remember, the Event Loop should orchestrate client requests, not fulfill them itself. For a complicated task, move the work off of the Event Loop onto a Worker Pool. You have two options for a destination Worker Pool to which to offload work:
- You can use the built-in Node.js Worker Pool by developing a C++ addon. On older versions of Node, build your C++ addon using NAN, and on newer versions use N-API.
- You can create and manage your own Worker Pool dedicated to computation rather than the Node.js I/O-themed Worker Pool. The most straightforward way to do this is using Child Process or Cluster.
When offloading, it helps to distinguish between CPU-intensive and I/O-intensive tasks, because they behave differently:
- A CPU-intensive task only makes progress when its Worker is scheduled, and the Worker must be scheduled onto one of your machine’s logical cores. If you have 4 logical cores and 5 Workers, one of these Workers cannot make progress.
- I/O-intensive tasks involve querying an external service provider (DNS, file system, etc.) and waiting for its response. While a Worker with an I/O-intensive task is waiting for its response, it has nothing else to do and can be de-scheduled by the operating system, giving another Worker a chance to submit its request.
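As a rough sketch of offloading with Child Process, the function below runs a CPU-bound loop in a separate Node.js process and reports the result through a callback. The name sumToNInChild and the inline script are hypothetical. A real server would more likely keep a fixed pool of long-lived child processes than spawn one per request.

```javascript
const { execFile } = require('child_process');

// Computes 1 + 2 + ... + n in a child process so the loop does not run
// on this process's Event Loop. Calls callback(err, total) when done.
function sumToNInChild(n, callback) {
  const limit = Math.floor(Number(n));
  if (!Number.isFinite(limit) || limit < 0) {
    process.nextTick(callback, new RangeError('n must be a non-negative number'));
    return;
  }

  // The child runs this small script and prints the result to stdout.
  const script =
    'let s = 0; for (let i = 1; i <= ' + limit + '; i++) s += i; ' +
    'process.stdout.write(String(s));';

  // process.execPath is the path of the currently running node binary.
  execFile(process.execPath, ['-e', script], (err, stdout) => {
    if (err) {
      callback(err);
      return;
    }
    callback(null, Number(stdout));
  });
}

sumToNInChild(100, (err, total) => {
  if (err) throw err;
  console.log('sum computed in child process:', total);
});
```

While the child computes, the parent’s Event Loop stays free to serve other clients; the cost is the overhead of starting the process and passing data to and from it.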
Don’t block the Worker Pool
Node.js has a Worker Pool composed of k Workers. Your goal should be to minimize the variation in Task times, and you should use Task partitioning to accomplish this.
Minimizing the variation in Task times
If a Worker’s current Task is much more expensive than other Tasks, then it will be unavailable to work on other pending Tasks. In other words, each relatively long Task effectively decreases the size of the Worker Pool by one until it is completed. To avoid this, you should try to minimize variation in the length of Tasks you submit to the Worker Pool. Two examples should illustrate the possible variation in Task times.
Variation example: Long-running file system reads
Suppose your server must read files in order to handle some client requests. Before v10, fs.readFile() was not partitioned: it submitted a single fs.read() Task spanning the entire file. If you read shorter files for some users and longer files for others, fs.readFile() may introduce significant variation in Task lengths.
For a worst-case scenario, suppose an attacker can convince your server to read an arbitrary file (this is a directory traversal vulnerability). If your server is running Linux, the attacker can name an extremely slow file: /dev/random. For all practical purposes, /dev/random is infinitely slow, and every Worker asked to read from /dev/random will never finish that Task.
Variation example: Long-running crypto operations
Suppose your server generates cryptographically secure random bytes using crypto.randomBytes(). crypto.randomBytes() is not partitioned: it creates a single randomBytes() Task to generate as many bytes as you requested. If you create fewer bytes for some users and more bytes for others, crypto.randomBytes() is another source of variation in Task lengths.
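One possible way to even out these Task lengths is to request the bytes in fixed-size chunks, so that each Worker Pool Task has a similar cost. This is a sketch under assumptions: randomBytesChunked and its chunkSize parameter are hypothetical, and a suitable chunk size depends on your workload.

```javascript
const crypto = require('crypto');

// Generates totalBytes random bytes as a series of crypto.randomBytes() calls
// of at most chunkSize bytes each. Calls callback(err, buffer) when done.
function randomBytesChunked(totalBytes, chunkSize, callback) {
  const chunks = [];
  let remaining = totalBytes;

  function next() {
    if (remaining === 0) {
      callback(null, Buffer.concat(chunks, totalBytes));
      return;
    }
    const size = Math.min(chunkSize, remaining);
    // Each asynchronous call is one comparable-cost Task on the Worker Pool.
    crypto.randomBytes(size, (err, buf) => {
      if (err) {
        callback(err);
        return;
      }
      chunks.push(buf);
      remaining -= size;
      next(); // submit the next chunk only after this one completes
    });
  }

  next();
}

randomBytesChunked(10, 4, (err, buf) => {
  if (err) throw err;
  console.log('generated', buf.length, 'bytes');
});
```

Note that when totalBytes is 0 this sketch invokes the callback synchronously.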
Task partitioning
Tasks with variable time costs can harm the throughput of the Worker Pool. To minimize variation in Task times, as far as possible you should partition each Task into comparable-cost sub-Tasks. When each sub-Task completes it should submit the next sub-Task, and when the final sub-Task completes it should notify the submitter. To continue the fs.readFile() example, you should instead use fs.read() (manual partitioning) or ReadStream (automatically partitioned).
When you partition a Task into sub-Tasks, shorter Tasks expand into a small number of sub-Tasks, and longer Tasks expand into a larger number of sub-Tasks. Between each sub-Task of a longer Task, the Worker to which it was assigned can work on a sub-Task from another, shorter, Task, thus improving the overall Task throughput of the Worker Pool.
Avoiding Task partitioning
Recall that the purpose of Task partitioning is to minimize the variation in Task times. If you can distinguish between shorter Tasks and longer Tasks (e.g. summing an array vs. sorting an array), you could create one Worker Pool for each class of Task. Routing shorter Tasks and longer Tasks to separate Worker Pools is another way to minimize Task time variation. The downside of this approach is that Workers in all of these Worker Pools will incur space and time overheads and will compete with each other for CPU time. Remember that each CPU-bound Task makes progress only while it is scheduled. As a result, you should only consider this approach after careful analysis.
The risks of npm modules
While the Node.js core modules offer building blocks for a wide variety of applications, sometimes something more is needed. Node.js developers benefit tremendously from the npm ecosystem, with hundreds of thousands of modules offering functionality to accelerate your development process. Remember, however, that the majority of these modules are written by third-party developers and are generally released with only best-effort guarantees. A developer using an npm module should be concerned about two things:
- Does it honor its APIs?
- Might its APIs block the Event Loop or a Worker?
Conclusion
Node.js has two types of threads: one Event Loop and k Workers. The Event Loop is responsible for JavaScript callbacks and non-blocking I/O, and a Worker executes tasks corresponding to C++ code that completes an asynchronous request, including blocking I/O and CPU-intensive work. Both types of threads work on no more than one activity at a time. If any callback or task takes a long time, the thread running it becomes blocked. If your application makes blocking callbacks or tasks, this can lead to degraded throughput (clients/second) at best, and complete denial of service at worst.
To write a high-throughput, more DoS-proof web server, you must ensure that on benign and on malicious input, neither your Event Loop nor your Workers will block.