Documentation Index

Fetch the complete documentation index at: https://mintlify.com/binary-person/rammerhead/llms.txt

Use this file to discover all available pages before exploring further.

Rammerhead includes built-in multi-threading support through Node.js’s cluster module and the sticky-session-custom package. When enabled, a master process accepts connections and routes each request to a worker process, distributing load across CPU cores while ensuring every request belonging to a session lands on the same worker.

Enabling workers

Two options in config.js control multi-threading:
```javascript
// src/config.js (defaults)
const os = require('os');

const enableWorkers = os.cpus().length !== 1;

module.exports = {
  enableWorkers,                // false on single-core machines, true otherwise
  workers: os.cpus().length,    // one worker per logical CPU
};
```
enableWorkers is automatically false on single-core machines so that development environments do not incur the overhead of the cluster module. On a production server with multiple cores, it is true by default. To override the defaults in your root config.js:
```javascript
// config.js (root of repository)
const os = require('os');

module.exports = {
  enableWorkers: true,
  workers: os.cpus().length,    // recommended: match the number of logical CPUs
};
```
Setting workers higher than os.cpus().length rarely improves throughput and increases memory usage because each worker loads the full Rammerhead session store.

Why sticky sessions matter

Each Rammerhead session is stored in memory on the worker that created it. If a second request for the same session is sent to a different worker, that worker has no knowledge of the session and will reject or mishandle the request. Sticky session routing ensures all requests for a given session always reach the same worker.
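A toy model shows why hashing yields stickiness. This is illustrative only: `pickWorker` is not a real Rammerhead or sticky-session-custom function, and the real library's hash differs:

```javascript
// Map a session ID to a worker index by hashing its characters.
// Same ID in, same index out, so every request for that session
// reaches the worker that holds it in memory.
function pickWorker(sessionId, workerCount) {
    let hash = 0;
    for (const ch of sessionId) {
        hash = (hash * 31 + ch.charCodeAt(0)) % workerCount;
    }
    return hash;
}

pickWorker('abc123', 4) === pickWorker('abc123', 4); // always true
```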

How request routing works

The master process uses sticky-session-custom to route incoming connections. Before a connection is handed off, the master reads the raw HTTP request bytes and extracts the session ID. It then hashes the session ID to a consistent worker index. The extraction logic is implemented in generatePrehashArray inside src/server/index.js:
```javascript
// src/server/index.js
generatePrehashArray(req) {
    let sessionId = getSessionId(req.url); // /sessionid/url
    if (!sessionId) {
        // /editsession?id=sessionid
        const parsed = new URL(req.url, 'https://a.com');
        sessionId = parsed.searchParams.get('id') || parsed.searchParams.get('sessionId');
        if (!sessionId) {
            // sessionId is in referer header
            for (let i = 0; i < req.headers.length; i += 2) {
                if (req.headers[i].toLowerCase() === 'referer') {
                    sessionId = getSessionId(req.headers[i + 1]);
                    break;
                }
            }
            if (!sessionId) {
                // if there is still none, it's likely a static asset, in which case,
                // just delegate it to a worker
                sessionId = ' ';
            }
        }
    }
    return sessionId.split('').map((e) => e.charCodeAt());
}
```
The function tries to find the session ID in three places, in order:
  1. URL path — for proxied page requests in the form /<sessionId>/<encoded-url>
  2. Query parameters — for management endpoints such as /editsession?id=<sessionId>
  3. Referer header — for sub-resources (images, scripts) fetched by the browser where the URL itself does not contain the session ID
If no session ID is found (e.g., for static assets like rammerhead.js), a single space character is used as the hash input, distributing these requests round-robin across workers. sticky-session-custom converts the returned character-code array into a consistent hash and maps it to a worker, ensuring repeatability.
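The returned prehash is nothing more than the session ID's character codes. A minimal reproduction of that final line (with the index passed explicitly; `charCodeAt()` with no argument defaults to index 0 anyway):

```javascript
// Convert a session ID to the character-code array that
// sticky-session-custom hashes to pick a worker.
const toPrehash = (sessionId) => sessionId.split('').map((c) => c.charCodeAt(0));

toPrehash('abc'); // [97, 98, 99]
toPrehash(' ');   // [32] — the static-asset fallback
```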

Stale session cleanup

Session file cleanup (deleting sessions that have not been used for three days) only runs on the master process. Worker processes have staleCleanupOptions set to null to avoid multiple workers competing to delete the same files:
```javascript
// src/server/index.js
const fileCacheOptions = { logger, ...config.fileCacheSessionConfig };
if (!cluster.isMaster) {
    fileCacheOptions.staleCleanupOptions = null;
}
```
This means the cleanup schedule defined in fileCacheSessionConfig.staleCleanupOptions is respected exactly once, by the master.
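The gating can be reduced to a pure function. This is a sketch for illustration; `buildFileCacheOptions` is a hypothetical helper, not part of Rammerhead:

```javascript
// Only the master keeps its stale-cleanup schedule; workers get it
// nulled out so the three-day deletion pass runs exactly once per host.
function buildFileCacheOptions(isMaster, baseOptions) {
    const options = { ...baseOptions };
    if (!isMaster) {
        options.staleCleanupOptions = null;
    }
    return options;
}
```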

The dontListen option

When enableWorkers is true, the RammerheadProxy constructor receives dontListen: true. This prevents the proxy from calling server.listen() itself. Instead, sticky-session-custom takes ownership of the socket binding and hands off pre-accepted connections to the correct worker:
```javascript
// src/server/index.js
const proxyServer = new RammerheadProxy({
    // ...
    dontListen: config.enableWorkers,
    // ...
});
```
Workers never bind to a port directly; they only receive connections forwarded by the master.

Dual-port vs. single-port configuration

When crossDomainPort is set, sticky-session-custom creates two separate load balancers — one for the main port and one for the cross-domain port:
```javascript
// src/server/index.js
const closeMasters = [
    sticky.listen(proxyServer.server1, config.port, config.bindingAddress, stickyOptions)
];
if (config.crossDomainPort) {
    closeMasters.push(
        sticky.listen(proxyServer.server2, config.crossDomainPort, config.bindingAddress, stickyOptions)
    );
}
```
Both load balancers use the same generatePrehashArray function, so a request on the cross-domain port for a given session routes to the same worker as the main-port requests for that session.
Choosing a worker count

Set workers to os.cpus().length (the default): one worker per logical CPU keeps context switching low:
```javascript
// config.js
const os = require('os');

module.exports = {
  enableWorkers: true,
  workers: os.cpus().length,
};
```
On a 4-core machine this starts 4 workers plus the master process (5 Node.js processes total). Each worker holds an independent in-memory JS rewrite cache, so total memory usage scales roughly linearly with the number of workers. Factor this in when sizing your server, and reduce workers if other memory-intensive processes share the same host.
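A back-of-the-envelope sizing helper makes the linear scaling concrete. The per-process figures below are placeholders, not measured Rammerhead numbers; profile your own deployment:

```javascript
// Total ≈ master RSS + workers × per-worker RSS, since each worker
// holds its own copy of the JS rewrite cache.
function estimateTotalMB(workers, perWorkerMB, masterMB) {
    return masterMB + workers * perWorkerMB;
}

estimateTotalMB(4, 150, 100); // 700 MB with these assumed figures
```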