Node.js Cluster Module
What is the Cluster Module?
The Cluster module provides a way to create multiple worker processes that share the same server port.
Since Node.js runs your JavaScript in a single thread, one process can only use one CPU core. The Cluster module helps your application utilize multiple CPU cores, significantly improving performance on multi-core systems.
Each worker runs in its own process with its own event loop and memory space, but they all share the same server port.
The master process is responsible for creating workers and distributing incoming connections among them.
Importing the Cluster Module
The Cluster module is included in Node.js by default.
You can use it by requiring it in your script:
const cluster = require('cluster');
const os = require('os');
// Check if this is the master process
if (cluster.isMaster) {
console.log(`Master process ${process.pid} is running`);
} else {
console.log(`Worker process ${process.pid} started`);
}
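Note: Since Node.js 16, the "master" process is officially called the primary process, and cluster.isPrimary is the preferred check; cluster.isMaster remains as a deprecated alias. The examples on this page use the older names, which still work.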
How Clustering Works
The Cluster module works by creating a master process that spawns multiple worker processes.
The master process doesn't run the server code itself; instead, it manages the workers.
Each worker process is a new Node.js instance that runs your application code independently.
Note: Under the hood, the Cluster module uses the Child Process module's fork() method to create new workers.
Process Type | Responsibility |
---|---|
Master | Creates and manages workers, and distributes incoming connections among them |
Worker | Runs the application code and handles the actual requests |
Creating a Basic Cluster
Here's a simple example of creating a cluster with worker processes for each CPU:
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;
if (cluster.isMaster) {
// This is the master process
console.log(`Master ${process.pid} is running`);
// Fork workers for each CPU core
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
// Listen for worker exits
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} died`);
// You can fork a new worker to replace the dead one
console.log('Forking a new worker...');
cluster.fork();
});
} else {
// This is a worker process
// Create an HTTP server
http.createServer((req, res) => {
res.writeHead(200);
res.end(`Hello from Worker ${process.pid}\n`);
// Simulate CPU work
let i = 1e7;
while (i > 0) { i--; }
}).listen(8000);
console.log(`Worker ${process.pid} started`);
}
In this example:
- The master process detects the number of CPU cores
- It forks one worker per CPU
- Each worker creates an HTTP server on the same port (8000)
- The cluster module automatically load balances the incoming connections
- If a worker crashes, the master creates a new one
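To see the load balancing in action, you can fire several requests at the server and watch the worker PID change between responses. A minimal sketch, assuming the cluster example above is running on port 8000:
const http = require('http');

// Send ten requests; responses should come from different worker PIDs
for (let i = 0; i < 10; i++) {
  http.get('http://localhost:8000', (res) => {
    let body = '';
    res.on('data', (chunk) => (body += chunk));
    res.on('end', () => process.stdout.write(body));
  });
}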
Worker Communication
You can communicate between master and worker processes using the send() method and message events, similar to how IPC works in the Child Process module.
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;
if (cluster.isMaster) {
console.log(`Master ${process.pid} is running`);
// Track request count for each worker
const requestCounts = {};
// Fork workers
for (let i = 0; i < numCPUs; i++) {
const worker = cluster.fork();
requestCounts[worker.id] = 0;
// Listen for messages from this worker
worker.on('message', (msg) => {
if (msg.cmd === 'incrementRequestCount') {
requestCounts[worker.id]++;
console.log(`Worker ${worker.id} (pid ${worker.process.pid}) has handled ${requestCounts[worker.id]} requests`);
}
});
}
// Every 10 seconds, send the request count to each worker
setInterval(() => {
for (const id in cluster.workers) {
cluster.workers[id].send({
cmd: 'requestCount',
requestCount: requestCounts[id]
});
}
console.log('Total request counts:', requestCounts);
}, 10000);
// Handle worker exit
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} died`);
// Fork a new worker to replace it
const newWorker = cluster.fork();
requestCounts[newWorker.id] = 0;
});
} else {
// Worker process
console.log(`Worker ${process.pid} started`);
let localRequestCount = 0;
// Handle messages from the master
process.on('message', (msg) => {
if (msg.cmd === 'requestCount') {
console.log(`Worker ${process.pid} has handled ${msg.requestCount} requests according to master`);
}
});
// Create an HTTP server
http.createServer((req, res) => {
// Notify the master that we handled a request
process.send({ cmd: 'incrementRequestCount' });
// Increment local count
localRequestCount++;
// Send response
res.writeHead(200);
res.end(`Hello from Worker ${process.pid}, I've handled ${localRequestCount} requests locally\n`);
}).listen(8000);
}
Zero-Downtime Restart
One of the main benefits of clustering is the ability to restart workers without downtime. This is useful for deploying updates to your application.
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;
if (cluster.isMaster) {
console.log(`Master ${process.pid} is running`);
// Store workers
const workers = [];
// Fork initial workers
for (let i = 0; i < numCPUs; i++) {
workers.push(cluster.fork());
}
// Function to restart workers one by one
function restartWorkers() {
console.log('Starting zero-downtime restart...');
let i = 0;
function restartWorker() {
if (i >= workers.length) {
console.log('All workers restarted successfully!');
return;
}
const worker = workers[i++];
console.log(`Restarting worker ${worker.process.pid}...`);
// Create a new worker
const newWorker = cluster.fork();
newWorker.on('listening', () => {
// Once the new worker is listening, kill the old one
worker.disconnect();
// Replace the old worker in our array
workers[workers.indexOf(worker)] = newWorker;
// Continue with the next worker
setTimeout(restartWorker, 1000);
});
}
// Start the recursive process
restartWorker();
}
// Simulate a restart after 20 seconds
setTimeout(restartWorkers, 20000);
// Handle normal worker exit
cluster.on('exit', (worker, code, signal) => {
if (worker.exitedAfterDisconnect !== true) {
console.log(`Worker ${worker.process.pid} died unexpectedly, replacing it...`);
const newWorker = cluster.fork();
workers[workers.indexOf(worker)] = newWorker;
}
});
} else {
// Worker process
// Create an HTTP server
http.createServer((req, res) => {
res.writeHead(200);
res.end(`Worker ${process.pid} responding, uptime: ${process.uptime().toFixed(2)} seconds\n`);
}).listen(8000);
console.log(`Worker ${process.pid} started`);
}
This example demonstrates:
- Creating an initial set of workers
- Replacing each worker one by one
- Ensuring a new worker is listening before disconnecting the old one
- Gracefully handling unexpected worker deaths
Load Balancing
The Cluster module has built-in load balancing for distributing incoming connections among worker processes.
There are two primary strategies:
Round-Robin (default)
By default on all platforms except Windows, Node.js distributes connections using a round-robin approach, where the master accepts connections and distributes them across workers in a circular sequence.
Note: On Windows, round-robin is not the default due to platform limitations; the default there is SCHED_NONE, which leaves distribution to the operating system, so workers compete to accept connections and the load can be uneven.
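If you want to force round-robin scheduling explicitly (for example, on Windows), set the policy before forking. A minimal sketch:
const cluster = require('cluster');

// Force round-robin distribution regardless of platform.
// This must be set before the first cluster.fork() call;
// alternatively, set the NODE_CLUSTER_SCHED_POLICY=rr environment variable.
cluster.schedulingPolicy = cluster.SCHED_RR;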
Operating System Scheduling (SCHED_NONE)
You can also let each worker accept connections directly by setting cluster.schedulingPolicy:
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;
// Set the scheduling policy to SCHED_NONE (let workers accept connections themselves)
// This must be set before the first cluster.fork() call
cluster.schedulingPolicy = cluster.SCHED_NONE;
if (cluster.isMaster) {
console.log(`Master ${process.pid} is running`);
// Fork workers
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} died`);
cluster.fork();
});
} else {
// Worker process
http.createServer((req, res) => {
res.writeHead(200);
res.end(`Hello from Worker ${process.pid}\n`);
}).listen(8000);
console.log(`Worker ${process.pid} started`);
}
Shared State
Since each worker runs in its own process with its own memory space, they cannot directly share state via variables. Instead, you can:
- Use IPC messaging (as shown in the communication example)
- Use external storage like Redis, MongoDB, or a file system (see the sketch after this list)
- Use sticky load balancing for session management
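For the external-storage option, here is a minimal sketch of sharing a request counter across workers, assuming a Redis server on localhost and the redis npm package (v4):
const cluster = require('cluster');
const http = require('http');
const { createClient } = require('redis');

if (cluster.isMaster) {
  for (let i = 0; i < require('os').cpus().length; i++) {
    cluster.fork();
  }
} else {
  const client = createClient(); // assumes Redis on localhost:6379
  client.connect().then(() => {
    http.createServer(async (req, res) => {
      // INCR is atomic, so the count stays consistent across all workers
      const total = await client.incr('total_requests');
      res.end(`Request ${total} across all workers, handled by ${process.pid}\n`);
    }).listen(8000);
  });
}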
Sticky Sessions Example
Sticky sessions ensure that requests from the same client always go to the same worker process:
const cluster = require('cluster');
const http = require('http');
const net = require('net');
const numCPUs = require('os').cpus().length;
if (cluster.isMaster) {
console.log(`Master ${process.pid} is running`);
// Fork workers
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
// Store worker references by id
const workers = {};
for (const id in cluster.workers) {
workers[id] = cluster.workers[id];
}
// Create a TCP server to route raw connections to workers
// pauseOnConnect ensures no data is read before the worker takes over
const server = net.createServer({ pauseOnConnect: true }, (connection) => {
// Get client IP
const clientIP = connection.remoteAddress || '';
// Simple hash function to determine which worker to use
const workerIds = Object.keys(workers);
const hash = clientIP.split(/[.:]/).reduce((a, b) => a + (parseInt(b, 10) || 0), 0);
const workerId = workerIds[hash % workerIds.length];
// Hand the raw socket to the selected worker
workers[workerId].send('sticky-session:connection', connection);
}).listen(8000);
console.log('Master server listening on port 8000');
// Handle worker exit
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} died`);
// Remove the dead worker
delete workers[worker.id];
// Create a replacement
const newWorker = cluster.fork();
workers[newWorker.id] = newWorker;
});
} else {
// Worker process
// The worker's HTTP server does not listen on the shared port;
// it only handles sockets passed from the master
const server = http.createServer((req, res) => {
res.writeHead(200);
res.end(`Handled by worker ${process.pid}\n`);
});
process.on('message', (msg, connection) => {
if (msg === 'sticky-session:connection' && connection) {
console.log(`Worker ${process.pid} received sticky connection`);
// Resume the paused socket and hand it to the HTTP server
server.emit('connection', connection);
connection.resume();
}
});
console.log(`Worker ${process.pid} started`);
}
This is a simplified example showing the concept of sticky sessions. In production, you'd typically:
- Use a more sophisticated hashing algorithm
- Use cookies or other session identifiers instead of IP addresses
- Handle socket connections more carefully
Worker Lifecycle
Understanding the worker lifecycle is important for properly managing your cluster:
Event | Description |
---|---|
fork | Emitted when a new worker is forked |
online | Emitted when the worker is running and ready to process messages |
listening | Emitted when the worker starts listening for connections |
disconnect | Emitted when a worker's IPC channel is disconnected |
exit | Emitted when a worker process exits |
const cluster = require('cluster');
const http = require('http');
if (cluster.isMaster) {
console.log(`Master ${process.pid} is running`);
// The 'fork' event is emitted on the cluster, not on the worker,
// so register it before calling cluster.fork()
cluster.on('fork', (worker) => {
console.log(`Worker ${worker.process.pid} is being forked`);
});
// Fork a worker
const worker = cluster.fork();
// Listen for the worker's lifecycle events
worker.on('online', () => {
console.log(`Worker ${worker.process.pid} is online`);
});
worker.on('listening', (address) => {
console.log(`Worker ${worker.process.pid} is listening on port ${address.port}`);
});
worker.on('disconnect', () => {
console.log(`Worker ${worker.process.pid} has disconnected`);
});
worker.on('exit', (code, signal) => {
console.log(`Worker ${worker.process.pid} exited with code ${code} and signal ${signal}`);
if (signal) {
console.log(`Worker was killed by signal: ${signal}`);
} else if (code !== 0) {
console.log(`Worker exited with error code: ${code}`);
} else {
console.log('Worker exited successfully');
}
});
// After 10 seconds, gracefully disconnect the worker
setTimeout(() => {
console.log('Gracefully disconnecting worker...');
worker.disconnect();
}, 10000);
} else {
// Worker process
console.log(`Worker ${process.pid} started`);
// Create an HTTP server
http.createServer((req, res) => {
res.writeHead(200);
res.end(`Hello from Worker ${process.pid}\n`);
}).listen(8000);
// If worker is disconnected, close the server
process.on('disconnect', () => {
console.log(`Worker ${process.pid} disconnected, closing server...`);
// In a real application, you'd want to close all connections and clean up resources
process.exit(0);
});
}
Graceful Shutdown
A graceful shutdown is important to allow your worker processes to finish handling existing requests before they exit.
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;
if (cluster.isMaster) {
console.log(`Master ${process.pid} is running`);
// Fork workers
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
// Handle termination signals
process.on('SIGTERM', () => {
console.log('Master received SIGTERM, initiating graceful shutdown...');
// Notify all workers to finish their work and exit
Object.values(cluster.workers).forEach(worker => {
console.log(`Sending SIGTERM to worker ${worker.process.pid}`);
worker.send('shutdown');
});
// Set a timeout to force-kill workers if they don't exit gracefully
setTimeout(() => {
console.log('Some workers did not exit gracefully, forcing shutdown...');
Object.values(cluster.workers).forEach(worker => {
if (!worker.isDead()) {
console.log(`Killing worker ${worker.process.pid}`);
worker.process.kill('SIGKILL');
}
});
// Exit the master
console.log('All workers terminated, exiting master...');
process.exit(0);
}, 5000);
});
// Handle worker exits
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} exited (${signal || code})`);
// If all workers have exited, exit the master
if (Object.keys(cluster.workers).length === 0) {
console.log('All workers have exited, shutting down master...');
process.exit(0);
}
});
// Log to show the master is ready
console.log(`Master ${process.pid} is ready with ${Object.keys(cluster.workers).length} workers`);
console.log('Send SIGTERM to the master process to initiate graceful shutdown');
} else {
// Worker process
console.log(`Worker ${process.pid} started`);
// Track if we're shutting down
let isShuttingDown = false;
let activeConnections = 0;
// Create HTTP server
const server = http.createServer((req, res) => {
// Track active connection
activeConnections++;
// Simulate a slow response
setTimeout(() => {
res.writeHead(200);
res.end(`Hello from Worker ${process.pid}\n`);
// Connection complete
activeConnections--;
// If we're shutting down and no active connections, close the server
if (isShuttingDown && activeConnections === 0) {
console.log(`Worker ${process.pid} has no active connections, closing server...`);
server.close(() => {
console.log(`Worker ${process.pid} closed server, exiting...`);
process.exit(0);
});
}
}, 2000);
});
// Start server
server.listen(8000);
// Handle shutdown message from master
process.on('message', (msg) => {
if (msg === 'shutdown') {
console.log(`Worker ${process.pid} received shutdown message, stopping new connections...`);
// Set shutdown flag
isShuttingDown = true;
// Stop accepting new connections
server.close(() => {
console.log(`Worker ${process.pid} closed server`);
// If no active connections, exit immediately
if (activeConnections === 0) {
console.log(`Worker ${process.pid} has no active connections, exiting...`);
process.exit(0);
} else {
console.log(`Worker ${process.pid} waiting for ${activeConnections} connections to finish...`);
}
});
}
});
// Also handle direct termination signal
process.on('SIGTERM', () => {
console.log(`Worker ${process.pid} received SIGTERM directly`);
// Use the same shutdown logic
isShuttingDown = true;
server.close(() => process.exit(0));
});
}
Best Practices
- Number of Workers: In most cases, create one worker per CPU core
- Stateless Design: Design your application to be stateless to work effectively with clusters
- Graceful Shutdown: Implement proper shutdown handling to avoid dropping connections
- Worker Monitoring: Monitor and replace crashed workers promptly
- Database Connections: Each worker has its own connection pool, so configure database connections appropriately (see the sizing sketch below)
- Shared Resources: Be careful with resources shared between workers (e.g., file locks)
- Keep Workers Lean: Avoid unnecessary memory usage in worker processes
Warning: Be careful with file-based locking and other shared resources when using multiple workers. Operations that were safe in a single-process application may cause race conditions with multiple workers.
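As an illustration of the database-connections point above, a common approach is to divide the database server's connection limit across the workers. A sketch, using a hypothetical limit of 100 connections:
const os = require('os');

const numWorkers = os.cpus().length;
const DB_MAX_CONNECTIONS = 100; // hypothetical server-side limit

// Each worker opens its own pool, so size it to keep the
// total across all workers within the server's limit
const poolSizePerWorker = Math.floor(DB_MAX_CONNECTIONS / numWorkers);
// Pass poolSizePerWorker as the maximum pool size to your database client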
Alternatives to the Cluster Module
While the Cluster module is powerful, there are alternatives for running Node.js applications on multiple cores:
Approach | Description | Use Case |
---|---|---|
PM2 | A process manager for Node.js applications with built-in load balancing and clustering | Production applications that need robust process management |
Load Balancer | Running multiple Node.js instances behind a load balancer like Nginx | Distributing load across multiple servers or containers |
Worker Threads | Lighter-weight threading for CPU-intensive tasks (Node.js >= 10.5.0) | CPU-intensive operations within a single process |
Containers | Running multiple containerized instances (e.g., with Docker and Kubernetes) | Scalable, distributed applications in modern cloud environments |
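For comparison, here is a minimal worker_threads sketch; threads share one process, which makes them cheaper to start than forked workers for CPU-bound jobs:
const { Worker, isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
  // Spawn a thread running this same file
  const worker = new Worker(__filename);
  worker.on('message', (sum) => console.log(`Sum from thread: ${sum}`));
} else {
  // CPU-bound work runs off the main event loop
  let sum = 0;
  for (let i = 0; i < 1e8; i++) sum += i;
  parentPort.postMessage(sum);
}
With PM2, clustering is a single command: pm2 start app.js -i max forks one instance per CPU core without any changes to your code.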
Advanced Load Balancing Strategies
While the Cluster module's default round-robin load balancing works well for many applications, you might need more sophisticated strategies for specific use cases.
1. Weighted Round-Robin
One way to implement weights is to give each worker multiple slots in the round-robin rotation:
const cluster = require('cluster');
const net = require('net');

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  // Workers with different weights: the first gets 3x the share of the last
  const workerWeights = [3, 2, 1];

  // Rotation pool: each worker appears once per unit of weight
  const pool = [];
  workerWeights.forEach((weight) => {
    const worker = cluster.fork({ WORKER_WEIGHT: weight });
    for (let i = 0; i < weight; i++) {
      pool.push(worker);
    }
  });

  // Track the next pool slot to use
  let next = 0;

  // The master accepts raw TCP connections and forwards them to workers.
  // pauseOnConnect ensures no data is read before the worker takes over.
  net.createServer({ pauseOnConnect: true }, (connection) => {
    const worker = pool[next++ % pool.length];
    worker.send('handle-request', connection);
  }).listen(8000);
} else {
  // Worker: answer forwarded sockets with a minimal raw HTTP response
  // (a simplification; a real worker would hand the socket to an HTTP server)
  process.on('message', (message, connection) => {
    if (message === 'handle-request' && connection) {
      connection.resume();
      connection.end(`HTTP/1.1 200 OK\r\nConnection: close\r\n\r\nHandled by worker ${process.pid}\n`);
    }
  });
}
2. Least Connections
Route each new connection to the worker that is currently handling the fewest connections:
const cluster = require('cluster');
const net = require('net');

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  // Create workers and track their connection counts
  const workers = [];
  const numCPUs = require('os').cpus().length;
  for (let i = 0; i < numCPUs; i++) {
    const worker = cluster.fork();
    worker.connectionCount = 0;
    workers.push(worker);

    // Workers report their current connection count
    worker.on('message', (msg) => {
      if (msg.type === 'connection') {
        worker.connectionCount = msg.count;
      }
    });
  }

  // Load balancer: forward each raw connection to the least-busy worker
  net.createServer({ pauseOnConnect: true }, (connection) => {
    let selectedWorker = workers[0];
    for (const worker of workers) {
      if (worker.connectionCount < selectedWorker.connectionCount) {
        selectedWorker = worker;
      }
    }
    // Optimistic local update until the worker reports back
    selectedWorker.connectionCount++;
    selectedWorker.send('handle-request', connection);
  }).listen(8000);
} else {
  // Worker: handle forwarded sockets and report the connection count
  let connections = 0;
  process.on('message', (message, connection) => {
    if (message === 'handle-request' && connection) {
      connections++;
      process.send({ type: 'connection', count: connections });
      connection.resume();
      connection.end(`HTTP/1.1 200 OK\r\nConnection: close\r\n\r\nHandled by worker ${process.pid}\n`);
      connections--;
      process.send({ type: 'connection', count: connections });
    }
  });
}
Performance Monitoring and Metrics
Monitoring your cluster's performance is crucial for maintaining a healthy application. Here's how to implement basic metrics collection with the third-party prom-client package (npm install prom-client):
const cluster = require('cluster');
const os = require('os');
const promClient = require('prom-client');
if (cluster.isMaster) {
// Create metrics registry
const register = new promClient.Registry();
promClient.collectDefaultMetrics({ register });
// Custom metrics
const workerRequests = new promClient.Counter({
name: 'worker_requests_total',
help: 'Total requests handled by worker',
labelNames: ['worker_pid']
});
register.registerMetric(workerRequests);
// Fork workers
for (let i = 0; i < os.cpus().length; i++) {
const worker = cluster.fork();
worker.on('message', (msg) => {
if (msg.type === 'request_processed') {
workerRequests.inc({ worker_pid: worker.process.pid });
}
});
}
// Expose metrics endpoint
require('http').createServer(async (req, res) => {
if (req.url === '/metrics') {
res.setHeader('Content-Type', register.contentType);
res.end(await register.metrics());
} else {
res.writeHead(404);
res.end();
}
}).listen(9090);
} else {
// Worker code
let requestCount = 0;
require('http').createServer((req, res) => {
requestCount++;
process.send({ type: 'request_processed' });
res.end(`Request ${requestCount} handled by worker ${process.pid}\n`);
}).listen(8000);
}
Key Metrics to Monitor
- Request Rate: Requests per second per worker
- Error Rate: Error responses per second
- Response Time: P50, P90, P99 response times
- CPU Usage: Per-worker CPU utilization
- Memory Usage: Heap and RSS memory per worker
- Event Loop Lag: Delay in the event loop
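Event loop lag in particular can be measured with the built-in perf_hooks module. A minimal sketch for use inside a worker:
const { monitorEventLoopDelay } = require('perf_hooks');

// Sample event loop delay with 20ms resolution
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

setInterval(() => {
  // Values are reported in nanoseconds
  console.log(`Event loop lag p99: ${(histogram.percentile(99) / 1e6).toFixed(2)} ms`);
  histogram.reset();
}, 10000);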
Container Integration
When running in containerized environments like Docker and Kubernetes, consider these best practices:
1. Process Management
# Dockerfile example for a Node.js cluster app
FROM node:16-slim
WORKDIR /app
COPY package*.json ./
RUN npm install --production
# Copy application code
COPY . .
# Use the node process as PID 1 for proper signal handling
CMD ["node", "cluster.js"]
# Health check (assumes curl is installed in the image and the app serves /health on port 8000)
HEALTHCHECK --interval=30s --timeout=3s \
  CMD curl -f http://localhost:8000/health || exit 1
2. Kubernetes Deployment
# k8s-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-cluster-app
spec:
  replicas: 3 # Number of pods
  selector:
    matchLabels:
      app: node-cluster
  template:
    metadata:
      labels:
        app: node-cluster
    spec:
      containers:
        - name: node-app
          image: your-image:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "1Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
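The probes above assume the application exposes /health and /ready endpoints. A minimal sketch of serving them from each worker (the paths are just a convention matching the probe configuration):
const http = require('http');

http.createServer((req, res) => {
  if (req.url === '/health') {
    // Liveness: the process is up and the event loop is responsive
    res.writeHead(200);
    return res.end('OK');
  }
  if (req.url === '/ready') {
    // Readiness: check dependencies (database, caches) before returning 200
    res.writeHead(200);
    return res.end('READY');
  }
  res.writeHead(200);
  res.end(`Hello from Worker ${process.pid}\n`);
}).listen(8000);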
Common Pitfalls and Solutions
1. Memory Leaks in Workers
Problem: Memory leaks in worker processes can cause gradual memory growth.
Solution: Implement worker recycling based on memory usage.
// In worker process
const MAX_MEMORY_MB = 500; // Max memory in MB before recycling
function checkMemory() {
const memoryUsage = process.memoryUsage();
const memoryMB = memoryUsage.heapUsed / 1024 / 1024;
if (memoryMB > MAX_MEMORY_MB) {
console.log(`Worker ${process.pid} memory ${memoryMB.toFixed(2)}MB exceeds limit, exiting...`);
process.exit(1); // Let cluster restart the worker
}
}
// Check memory every 30 seconds
setInterval(checkMemory, 30000);
2. Thundering Herd Problem
Problem: All workers accepting connections simultaneously after a restart.
Solution: Implement staggered startup.
// In master process
if (cluster.isMaster) {
const numWorkers = require('os').cpus().length;
function forkWorker(delay) {
setTimeout(() => {
const worker = cluster.fork();
console.log(`Worker ${worker.process.pid} started after ${delay}ms delay`);
}, delay);
}
// Stagger worker starts by 1 second
for (let i = 0; i < numWorkers; i++) {
forkWorker(i * 1000);
}
}
3. Worker Starvation
Problem: Some workers get more load than others.
Solution: Implement proper load balancing and monitoring.
// Track request distribution
const requestDistribution = new Map();
// In master process
if (cluster.isMaster) {
// ...
// Monitor request distribution
setInterval(() => {
console.log('Request distribution:');
requestDistribution.forEach((count, pid) => {
console.log(` Worker ${pid}: ${count} requests`);
});
}, 60000);
// Track requests per worker
cluster.on('message', (worker, message) => {
if (message.type === 'request_handled') {
const count = requestDistribution.get(worker.process.pid) || 0;
requestDistribution.set(worker.process.pid, count + 1);
}
});
}
Summary
The Node.js Cluster module provides an efficient way to scale your application across multiple CPU cores:
- Creates a master process that manages multiple worker processes
- Workers share the same server port, allowing load balancing
- Improves application performance and resilience
- Enables zero-downtime restarts and graceful shutdowns
- Uses IPC for communication between master and workers
By understanding and properly implementing clustering, you can build high-performance, reliable Node.js applications that efficiently utilize all available CPU resources.