Signaling Server

Overview

The signaling server coordinates WebRTC connections between Windows hosts and browser clients. It uses WebSocket for real-time communication and Redis pub/sub for multi-instance coordination. Key features:

WebSocket signaling for SDP and ICE candidate exchange
Redis pub/sub for scaling across multiple instances
Rate limiting and circuit breakers
Health checks and Prometheus metrics
JWT authentication (optional)

Requirements

Node.js 16+ and npm
Redis 6+ (required for production, single instance OK for development)
1-2 GB RAM per instance
WebSocket-capable hosting (Railway, Heroku, AWS, GCP, etc.)

Installation

Navigate to the Server directory

cd Server

Install dependencies

npm install

This installs:

ws - WebSocket server
redis - Redis client with pub/sub support
express - HTTP server for health checks
pino - Structured logging
zod - Schema validation
jsonwebtoken / jose - JWT authentication (optional)

Configuration

Environment Variables

Create a .env file in the Server/ directory:

# Server Configuration
NODE_ENV=development
WS_PORT=3002

# Redis Connection
REDIS_URL=redis://127.0.0.1:6379

# Room Settings
ROOM_CAPACITY=2
ROOM_TTL_SECONDS=120
ROOM_ID_MAX_LENGTH=64

# WebSocket Settings
MESSAGE_MAX_BYTES=262144
BACKPRESSURE_CLOSE_THRESHOLD_BYTES=5242880
HEARTBEAT_INTERVAL_MS=30000

# Rate Limiting
RATE_LIMIT_CONN_PER_10S=20
RATE_LIMIT_MESSAGES_PER_10S=200
RATE_LIMIT_IP_MSGS_PER_10S=400
RATE_LIMIT_ROOM_MSGS_PER_10S=800

# Circuit Breaker
CB_ERROR_THRESHOLD=5
CB_OPEN_MS=10000

# Logging
LOG_LEVEL=info
PRETTY_LOGS=true
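The CB_ERROR_THRESHOLD and CB_OPEN_MS values above are not described further in the reference tables. As a rough illustration only, a circuit breaker driven by two such settings often looks like the sketch below. The class name, the "consecutive failures" rule, and the reset behaviour are assumptions, not the server's confirmed logic.

```javascript
// Hypothetical sketch: opens after N consecutive failures, stays open for openMs.
class CircuitBreaker {
  constructor({ errorThreshold = 5, openMs = 10_000 } = {}) {
    this.errorThreshold = errorThreshold; // cf. CB_ERROR_THRESHOLD
    this.openMs = openMs;                 // cf. CB_OPEN_MS
    this.failures = 0;
    this.openedAt = null;
  }

  // True while calls should be short-circuited.
  isOpen(now = Date.now()) {
    if (this.openedAt === null) return false;
    if (now - this.openedAt >= this.openMs) {
      // Cool-down elapsed: close the breaker and start counting again.
      this.openedAt = null;
      this.failures = 0;
      return false;
    }
    return true;
  }

  recordSuccess() {
    this.failures = 0;
  }

  recordFailure(now = Date.now()) {
    this.failures += 1;
    if (this.failures >= this.errorThreshold) this.openedAt = now;
  }
}
```

A caller would check isOpen() before a protected operation and report the outcome with recordSuccess() or recordFailure().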

Configuration Reference

Core Settings

Variable	Type	Description	Default
`NODE_ENV`	string	Environment mode	`development`
`WS_PORT`	number	WebSocket port (local dev only)	`3002`
`PORT`	number	HTTP/WS port (auto-set by hosting platforms)	-
`HEALTH_PORT`	number	Health check endpoint port	`8080`
`REDIS_URL`	string	Redis connection URL	`redis://127.0.0.1:6379`

Railway, Heroku, and similar platforms automatically inject the PORT environment variable. The server listens on PORT if available, falling back to WS_PORT.
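The fallback order described above can be sketched as follows. resolvePort is a hypothetical helper written for illustration; the server's actual startup code may differ.

```javascript
// Hypothetical helper: PORT wins, then WS_PORT, then the documented default 3002.
function resolvePort(env = process.env) {
  // `||` treats an empty string the same as an unset variable.
  const raw = env.PORT || env.WS_PORT || '3002';
  const port = Number.parseInt(raw, 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port value: ${raw}`);
  }
  return port;
}
```

Validating the value at startup surfaces a mistyped PORT immediately instead of as a confusing bind error later.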

Room Settings

Variable	Type	Description	Default
`ROOM_CAPACITY`	number	Max clients per room	`2`
`ROOM_TTL_SECONDS`	number	Room expiration time after last member leaves	`120`
`ROOM_ID_MAX_LENGTH`	number	Maximum room ID length	`64`
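To make the three room settings concrete, here is a minimal in-memory sketch. It is an assumption-laden illustration: LocalRooms and its return values are hypothetical, and the real server coordinates room state through Redis rather than a local Map.

```javascript
// Hypothetical in-memory registry illustrating capacity, TTL, and ID length.
class LocalRooms {
  constructor({ capacity = 2, ttlSeconds = 120, idMaxLength = 64 } = {}) {
    this.capacity = capacity;       // cf. ROOM_CAPACITY
    this.ttlMs = ttlSeconds * 1000; // cf. ROOM_TTL_SECONDS
    this.idMaxLength = idMaxLength; // cf. ROOM_ID_MAX_LENGTH
    this.rooms = new Map();         // roomId -> { members: Set, emptySince }
  }

  join(roomId, clientId, now = Date.now()) {
    if (typeof roomId !== 'string' || roomId.length === 0 || roomId.length > this.idMaxLength) {
      return 'invalid_room_id';
    }
    this.expire(now);
    let room = this.rooms.get(roomId);
    if (!room) {
      room = { members: new Set(), emptySince: null };
      this.rooms.set(roomId, room);
    }
    if (!room.members.has(clientId) && room.members.size >= this.capacity) {
      return 'room_full';
    }
    room.members.add(clientId);
    room.emptySince = null;
    return 'joined';
  }

  leave(roomId, clientId, now = Date.now()) {
    const room = this.rooms.get(roomId);
    if (!room) return;
    room.members.delete(clientId);
    // The TTL clock starts when the last member leaves.
    if (room.members.size === 0) room.emptySince = now;
  }

  // Drop rooms that have been empty for at least the TTL.
  expire(now = Date.now()) {
    for (const [id, room] of this.rooms) {
      if (room.emptySince !== null && now - room.emptySince >= this.ttlMs) {
        this.rooms.delete(id);
      }
    }
  }
}
```

With the default capacity of 2, a room holds one host and one browser client; a third join attempt is refused.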

WebSocket Settings

Variable	Type	Description	Default
`MESSAGE_MAX_BYTES`	number	Max message size (bytes)	`262144` (256 KB)
`BACKPRESSURE_CLOSE_THRESHOLD_BYTES`	number	Close connection if buffer exceeds this	`5242880` (5 MB)
`HEARTBEAT_INTERVAL_MS`	number	Ping interval to detect dead connections	`30000` (30s)
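The heartbeat and backpressure settings can be pictured with the sketch below. It follows the ping/pong liveness pattern commonly used with the ws library (isAlive flag, ping(), terminate(), bufferedAmount), but the function names and the 1013 close code are assumptions, not the server's confirmed behaviour.

```javascript
// Sketch only: each `socket` is a ws-style object with an isAlive flag maintained
// by the caller (set to true in the socket's 'pong' handler).
const BACKPRESSURE_CLOSE_THRESHOLD_BYTES = 5 * 1024 * 1024; // 5242880

// Run once per HEARTBEAT_INTERVAL_MS: drop sockets that never answered the last ping.
function heartbeatSweep(sockets) {
  for (const socket of sockets) {
    if (socket.isAlive === false) {
      socket.terminate(); // no pong since the previous sweep
      continue;
    }
    socket.isAlive = false; // flipped back to true when a pong arrives
    socket.ping();
  }
}

// Close a socket whose outgoing buffer has grown past the threshold.
function enforceBackpressure(socket, thresholdBytes = BACKPRESSURE_CLOSE_THRESHOLD_BYTES) {
  if (socket.bufferedAmount > thresholdBytes) {
    socket.close(1013, 'Backpressure'); // close code chosen for illustration
    return true;
  }
  return false;
}
```

In a running server, heartbeatSweep would typically be scheduled with setInterval, and enforceBackpressure called before each send.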

Rate Limiting

Variable	Type	Description	Default
`RATE_LIMIT_CONN_PER_10S`	number	Max new connections per IP per 10s	`20`
`RATE_LIMIT_MESSAGES_PER_10S`	number	Max messages per client per 10s	`200`
`RATE_LIMIT_IP_MSGS_PER_10S`	number	Max messages per IP per 10s	`400`
`RATE_LIMIT_ROOM_MSGS_PER_10S`	number	Max messages per room per 10s	`800`
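The document does not say which algorithm enforces these limits. As one plausible reading, a fixed 10-second window counter per key (client, IP, or room) could look like this; FixedWindowLimiter is a hypothetical name and the windowing strategy is an assumption.

```javascript
// Hypothetical fixed-window limiter: at most `limit` events per key per window.
class FixedWindowLimiter {
  constructor(limit, windowMs = 10_000) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if the event is allowed, false if it should be dropped.
  allow(key, now = Date.now()) {
    const entry = this.windows.get(key);
    if (!entry || now - entry.start >= this.windowMs) {
      // First event for this key, or the previous window has ended.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

One limiter per row of the table (for example new FixedWindowLimiter(200) keyed by client ID and new FixedWindowLimiter(400) keyed by IP) would let a message pass only if every applicable limiter allows it.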

Security

Variable	Type	Description	Default
`REQUIRE_WSS`	boolean	Reject non-WSS connections in production	`false`
`ALLOWED_ORIGINS`	string	Comma-separated list of allowed origins	`""` (all)
`SUBPROTOCOL`	string	Required WebSocket subprotocol	-
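A comma-separated ALLOWED_ORIGINS value might be interpreted along these lines. Both helper names are hypothetical, and exact string matching against the Origin header is an assumption about how the check works.

```javascript
// Hypothetical helpers for an origin allow-list.
function parseAllowedOrigins(value = '') {
  return value
    .split(',')
    .map((origin) => origin.trim())
    .filter(Boolean); // drop empty entries such as a trailing comma
}

function isOriginAllowed(origin, allowed) {
  // An empty list means "all origins", matching the default in the table.
  if (allowed.length === 0) return true;
  return typeof origin === 'string' && allowed.includes(origin);
}
```

Note that an origin includes the scheme and port, so https://game.com and https://www.game.com are separate entries.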

Running Locally

Start Redis

redis-server

Or with Docker:

docker run -d -p 6379:6379 redis:7-alpine

Start the signaling server

# Production mode
npm start

# Development mode (pretty logs)
npm run dev

The server will start on port 3002 (or the port specified by PORT env var).

Verify it's running

# Health check
curl http://localhost:3002/healthz
# Should return: ok

# Readiness check (verifies Redis connection)
curl http://localhost:3002/readyz
# Should return: ready

# Metrics (Prometheus format)
curl http://localhost:3002/metrics

Testing WebSocket Connection

Test the WebSocket endpoint:

const ws = new WebSocket('ws://localhost:3002?roomId=test-room');

ws.onopen = () => console.log('Connected');
ws.onmessage = (event) => console.log('Message:', event.data);
ws.onerror = (error) => console.error('Error:', error);

Deploying to Production

Railway Deployment

Create a new Railway project

# Install Railway CLI
npm install -g @railway/cli

# Login
railway login

# Initialize project
railway init

Add Redis service

In the Railway dashboard:

New → Database → Redis

Railway will automatically inject REDIS_URL as an environment variable.

Configure environment variables

In Railway dashboard, add these variables:

NODE_ENV=production
REQUIRE_WSS=true
ALLOWED_ORIGINS=https://yourgame.com
LOG_LEVEL=warn
ROOM_CAPACITY=2

Deploy

cd Server
railway up

Railway will:

Detect the Node.js application
Run npm install
Execute npm start
Expose the service on a public URL

Note the deployment URL

Railway provides a URL like:

https://signaling-server-production-abc123.up.railway.app

Update your SIGNALING_PUBLIC_URL in the matchmaker and host config:

wss://signaling-server-production-abc123.up.railway.app

Docker Deployment

Create a Dockerfile in the Server/ directory:

FROM node:18-alpine

WORKDIR /app

# Copy package files
COPY package*.json ./

# Install production dependencies
RUN npm ci --omit=dev

# Copy application code
COPY . .

# Document the default port (the server listens on PORT if it is set)
EXPOSE 3002

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s \
  CMD node -e "require('http').get('http://localhost:'+(process.env.PORT||process.env.WS_PORT||3002)+'/healthz', (r) => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))"

# Start server
CMD ["npm", "start"]

Build and run:

# Build
docker build -t cloudgaming-signaling .

# Run
docker run -d \
  -p 3002:3002 \
  -e REDIS_URL=redis://host.docker.internal:6379 \
  -e NODE_ENV=production \
  cloudgaming-signaling

Kubernetes / Docker Compose

Docker Compose

version: '3.8'

services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes

  signaling:
    build: ./Server
    ports:
      - "3002:3002"
    environment:
      NODE_ENV: production
      REDIS_URL: redis://redis:6379
      ROOM_CAPACITY: 2
      LOG_LEVEL: info
    depends_on:
      - redis
    restart: unless-stopped

volumes:
  redis-data:

Run with:

docker-compose up -d

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: signaling-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: signaling
  template:
    metadata:
      labels:
        app: signaling
    spec:
      containers:
      - name: signaling
        image: cloudgaming-signaling:latest
        ports:
        - containerPort: 3002
        env:
        - name: NODE_ENV
          value: "production"
        - name: REDIS_URL
          valueFrom:
            secretKeyRef:
              name: redis-credentials
              key: url
        - name: ROOM_CAPACITY
          value: "2"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 3002
          initialDelaySeconds: 10
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /readyz
            port: 3002
          initialDelaySeconds: 5
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: signaling-service
spec:
  selector:
    app: signaling
  ports:
  - protocol: TCP
    port: 3002
    targetPort: 3002
  type: LoadBalancer

Horizontal Scaling

The signaling server supports horizontal scaling via Redis pub/sub:

┌─────────┐     ┌─────────┐     ┌─────────┐
│ Client  │────▶│  LB     │────▶│Instance1│
└─────────┘     └─────────┘  │  └─────────┘
                              │       │
                              │  ┌────▼─────┐
                              │  │  Redis   │
                              │  │ Pub/Sub  │
                              │  └────▲─────┘
                              │       │
                              └──▶┌─────────┐
                                  │Instance2│
                                  └─────────┘

How it works:

Each instance maintains local WebSocket connections
Messages are published to Redis channel room:{roomId}
All instances subscribed to that channel receive and forward messages
Local room state is synchronized via Redis atomic operations

No sticky sessions required. Clients can connect to any instance.
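The fan-out steps above can be sketched without a Redis server by letting an EventEmitter stand in for the pub/sub channel. Everything here (the Instance class, the message shape, skipping the sender) is an illustrative assumption; only the room:{roomId} channel naming comes from the description above.

```javascript
const { EventEmitter } = require('events');

// Stand-in for Redis pub/sub so the sketch runs on its own.
const bus = new EventEmitter();

class Instance {
  constructor(name) {
    this.name = name;
    this.localRooms = new Map(); // roomId -> Set of locally connected clients
  }

  join(roomId, client) {
    if (!this.localRooms.has(roomId)) {
      this.localRooms.set(roomId, new Set());
      // First local member: subscribe this instance to the room channel.
      bus.on(`room:${roomId}`, (msg) => this.deliver(roomId, msg));
    }
    this.localRooms.get(roomId).add(client);
  }

  // Publish to the shared channel instead of writing to sockets directly.
  publish(roomId, senderId, payload) {
    bus.emit(`room:${roomId}`, { senderId, payload });
  }

  // Forward a channel message to local members other than the sender.
  deliver(roomId, msg) {
    for (const client of this.localRooms.get(roomId) ?? []) {
      if (client.id !== msg.senderId) client.inbox.push(msg.payload);
    }
  }
}

// A host on instance 1 and a browser on instance 2 share one room.
const host = { id: 'host-1', inbox: [] };
const browser = { id: 'browser-1', inbox: [] };
const instance1 = new Instance('instance1');
const instance2 = new Instance('instance2');
instance1.join('test-room', host);
instance2.join('test-room', browser);
instance1.publish('test-room', host.id, '{"type":"offer"}');
console.log(browser.inbox);
```

Because every instance subscribed to the channel receives the message, the browser gets the offer even though it is connected to a different instance than the host.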

Load Balancing

For multiple instances, use a load balancer with WebSocket support:

NGINX: Enable proxy_http_version 1.1 and upgrade headers
HAProxy: Use option http-server-close
AWS ALB: WebSocket connections are supported natively; raise the idle timeout for long-lived connections
Railway: Load balances automatically with zero configuration

Monitoring

Prometheus Metrics

The server exposes metrics at /metrics:

# Active WebSocket connections
cloudgaming_active_connections 42

# Local rooms on this instance
cloudgaming_local_rooms 21

# Redis connection status (1=up, 0=down)
cloudgaming_redis_up 1

# Circuit breaker status (1=open, 0=closed)
cloudgaming_circuit_breaker_open 0

# Message counters
cloudgaming_messages_forwarded_total 12543
cloudgaming_schema_rejects_total 7
cloudgaming_rate_limit_drops_total 23

# Operation latency histograms
cloudgaming_redis_operation_duration_seconds{...}
cloudgaming_fanout_duration_seconds{...}

Health Checks

Liveness: GET /healthz - Always returns 200 if server is running
Readiness: GET /readyz - Returns 200 only if Redis is connected and not draining
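The difference between the two probes can be expressed as a small pure function. healthResponse is a hypothetical helper; the 503 status and the "not ready" body for a failed readiness check are assumptions, since the text above only specifies when 200 is returned.

```javascript
// Hypothetical mapping from request path and server state to a probe response.
function healthResponse(path, { redisConnected, draining }) {
  if (path === '/healthz') {
    // Liveness: the process is up, regardless of dependencies.
    return { status: 200, body: 'ok' };
  }
  if (path === '/readyz') {
    // Readiness: only accept traffic when Redis is reachable and not shutting down.
    const ready = redisConnected && !draining;
    return ready
      ? { status: 200, body: 'ready' }
      : { status: 503, body: 'not ready' };
  }
  return { status: 404, body: 'not found' };
}
```

Keeping liveness independent of Redis avoids restart loops during a Redis outage, while readiness removes the instance from load balancer rotation until Redis is back.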

Logging

The server uses structured JSON logging (Pino):

{
  "level": "info",
  "time": 1234567890,
  "msg": "Client joined room",
  "clientId": "client:abc-123",
  "roomId": "room-456",
  "localCount": 2
}

Set PRETTY_LOGS=true for human-readable output during development.

Troubleshooting

Error: Redis connection failed

Symptoms: Server crashes or readiness check fails

Solutions:

Verify Redis is running: redis-cli ping
Check REDIS_URL format: redis://[user:password@]host:port[/db]
Verify network connectivity (firewall, security groups)
For Railway: Ensure Redis service is in the same project
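When checking the REDIS_URL format, it can help to parse the value and inspect its parts. The sketch below uses Node's built-in URL class; describeRedisUrl is a hypothetical helper, and treating rediss:// as the TLS variant follows common Redis client convention rather than anything stated in this guide.

```javascript
// Hypothetical diagnostic helper: break a Redis URL into its components.
function describeRedisUrl(redisUrl) {
  const url = new URL(redisUrl); // throws on strings that are not URLs at all
  if (url.protocol !== 'redis:' && url.protocol !== 'rediss:') {
    throw new Error(`Unexpected scheme: ${url.protocol}`);
  }
  return {
    tls: url.protocol === 'rediss:',
    host: url.hostname,
    port: url.port ? Number(url.port) : 6379,
    hasPassword: url.password !== '', // report presence only, never log the secret
    db: url.pathname.length > 1 ? Number(url.pathname.slice(1)) : 0,
  };
}
```

Running this against the configured value quickly reveals a missing scheme, a wrong port, or an absent password.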

Error: EADDRINUSE (port already in use)

Solution: Change the port or kill the process using it

# Find process
lsof -i :3002

# Kill it
kill -9 <PID>

# Or use a different port
export WS_PORT=3003
npm start

WebSocket connection fails with 426 Upgrade Required

Cause: Client is making a plain HTTP request instead of a WebSocket upgrade

Solution: Ensure the client uses the ws:// or wss:// protocol:

new WebSocket('ws://localhost:3002?roomId=test')

Connection rejected: Origin not allowed

Cause: ALLOWED_ORIGINS is configured but the client origin doesn't match

Solution: Add the client origin to the allowed list:

ALLOWED_ORIGINS=https://game.com,https://www.game.com,http://localhost:8080

High memory usage with many connections

Solutions:

Lower BACKPRESSURE_CLOSE_THRESHOLD_BYTES to close slow clients faster
Enable rate limiting with stricter limits
Scale horizontally with more instances
Monitor and close inactive connections

Messages not reaching clients on different instances

Cause: Redis pub/sub not working

Diagnosis:

# Monitor Redis pub/sub
redis-cli
PSUBSCRIBE room:*
# Should see messages when clients communicate

Solutions:

Verify all instances use the same REDIS_URL
Check Redis logs for errors
Ensure Redis allows pub/sub (not in cluster mode with restrictions)

Security Best Practices

For production deployments:

Always use WSS (secure WebSocket): REQUIRE_WSS=true
Restrict origins: ALLOWED_ORIGINS=https://yourgame.com
Enable JWT authentication for sensitive applications
Use managed Redis with authentication and encryption
Monitor rate limit violations
Set up alerts for circuit breaker openings
Keep dependencies updated: npm audit

Next Steps

Matchmaker Deployment - Set up host registration and matching
Host Setup - Configure Windows hosts to connect to this signaling server
Client Deployment - Deploy the browser client
