
Overview

Agent Browser uses a client-daemon architecture that combines the performance of native code with the power of browser automation:
  1. Rust CLI - Fast native binary that parses commands and communicates with the daemon
  2. Node.js Daemon - Persistent background process that manages the Playwright browser instance
  3. Chromium Browser - Headless or headed browser controlled via Playwright
This architecture provides sub-millisecond command parsing overhead while maintaining a persistent browser session between commands.

Why This Architecture?

The dual-process design solves several key problems:
  • Fast Command Execution: The Rust CLI handles argument parsing in native code (< 1ms overhead)
  • Persistent Browser: The Node.js daemon keeps the browser alive between commands, avoiding slow startup
  • Clean Separation: The CLI handles user interaction while the daemon focuses on browser control
  • Fallback Support: If the native binary is unavailable, commands can fall back to Node.js

Communication Flow

IPC Transport

The CLI and daemon communicate via:
  • Unix Domain Sockets (macOS/Linux): ~/.agent-browser/{session}.sock
  • TCP Localhost (Windows): Port derived from session name (49152-65535 range)
Each message is a JSON object terminated by a newline (\n).
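The newline framing can be sketched as follows. This is a minimal illustration rather than the project's code: `LineFramer`, `encodeMessage`, and the `action`/`url` fields are hypothetical names, since this page does not specify the message schema.

```typescript
// Hypothetical helper: turns a stream of socket chunks into
// newline-delimited JSON messages.
class LineFramer {
  private buffer = "";

  // Feed one chunk; returns every message completed by this chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const messages: unknown[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > 0) {
        messages.push(JSON.parse(line));
      }
      newline = this.buffer.indexOf("\n");
    }
    return messages;
  }
}

// Encode one message: a JSON object followed by a single newline.
// JSON.stringify escapes newlines inside strings, so the terminator
// never appears in the middle of a message.
function encodeMessage(message: object): string {
  return JSON.stringify(message) + "\n";
}
```

A chunk that ends mid-message stays in the buffer until the rest arrives, so partial reads from the socket do not produce parse errors.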

Daemon Lifecycle

Automatic Start

The daemon starts automatically on the first command:
agent-browser open example.com  # Starts daemon + launches browser
The CLI checks for a running daemon by reading the PID file at ~/.agent-browser/{session}.pid.

Session Persistence

The daemon persists between commands, keeping the browser session alive:
agent-browser open example.com   # Daemon starts, browser launches
agent-browser snapshot           # Reuses existing browser
agent-browser click @e1          # Still the same browser instance
This avoids the 2-3 second browser startup cost on every command.

Graceful Shutdown

The daemon shuts down on:
  • Explicit close command: agent-browser close
  • Process signals: SIGINT, SIGTERM, SIGHUP
  • Unexpected errors (cleanup before exit)
On shutdown, the daemon:
  1. Closes the browser
  2. Removes socket/PID files
  3. Auto-saves session state (if --session-name is set)
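The shutdown path above could be wired up roughly like this. It is a sketch under stated assumptions: `ShutdownTargets`, `removeSessionFiles`, `createShutdown`, and `installSignalHandlers` are hypothetical names, and the state auto-save step is omitted because this page does not describe how it works.

```typescript
import * as fs from "node:fs";

// Hypothetical shape; the real daemon's internals are not shown on this page.
interface ShutdownTargets {
  closeBrowser: () => Promise<void>;
  files: string[]; // socket and PID file paths
}

// Remove socket/PID files, ignoring any that are already gone.
function removeSessionFiles(files: string[]): void {
  for (const file of files) {
    fs.rmSync(file, { force: true });
  }
}

// Returns a shutdown function whose cleanup runs at most once,
// even if several signals arrive in quick succession.
function createShutdown(targets: ShutdownTargets): () => Promise<void> {
  let pending: Promise<void> | null = null;
  return () => {
    if (pending === null) {
      pending = targets
        .closeBrowser()
        .catch(() => undefined) // still remove files if closing fails
        .then(() => removeSessionFiles(targets.files));
    }
    return pending;
  };
}

// Route each signal listed above through the same shutdown path.
function installSignalHandlers(shutdown: () => Promise<void>): void {
  for (const signal of ["SIGINT", "SIGTERM", "SIGHUP"] as const) {
    process.once(signal, () => {
      void shutdown().finally(() => process.exit(0));
    });
  }
}
```

Funnelling every trigger through one memoized function means a second signal during cleanup waits for the first cleanup instead of starting another.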

Session Isolation

Multiple isolated browser sessions can run concurrently using the --session flag:
agent-browser --session agent1 open site-a.com
agent-browser --session agent2 open site-b.com
Each session has its own:
  • Daemon process
  • Browser instance
  • Socket file (.sock) or TCP port (Windows)
  • PID file (.pid)
Sessions are isolated at the OS process level - they don’t share memory or state. See Sessions for details.
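To make the per-session files concrete, here is a small sketch that derives them from the session name. `sessionPaths` is a hypothetical helper, not a function from the codebase; it follows the ~/.agent-browser/{session}.sock and {session}.pid layout described on this page and does not cover the Windows TCP port.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: computes the per-session file paths.
// socketDir defaults to ~/.agent-browser, as described in this page.
function sessionPaths(
  session: string = "default",
  socketDir: string = path.join(os.homedir(), ".agent-browser"),
): { socket: string; pid: string } {
  return {
    socket: path.join(socketDir, `${session}.sock`),
    pid: path.join(socketDir, `${session}.pid`),
  };
}
```

Two sessions never resolve to the same socket or PID file, which is what lets their daemons run side by side.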

Command Serialization

The daemon processes commands serially (one at a time) to prevent race conditions:
// From daemon.ts:376-377
const commandQueue: string[] = [];
let processing = false;
When multiple commands arrive simultaneously, they’re queued and executed in order. This prevents:
  • Concurrent writes to the same socket
  • Buffer contention causing EAGAIN errors on the CLI side
  • Playwright state corruption from parallel operations
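One way such a queue can work is sketched below. This is an illustration built around the two variables shown above, not the daemon's actual implementation; `SerialQueue` and its `handle` callback are hypothetical.

```typescript
// Hypothetical sketch: runs one async command handler at a time.
class SerialQueue {
  private readonly queue: string[] = [];
  private processing = false;
  private readonly handle: (command: string) => Promise<void>;

  constructor(handle: (command: string) => Promise<void>) {
    this.handle = handle;
  }

  // Add a command; it starts only after all earlier commands finish.
  enqueue(command: string): void {
    this.queue.push(command);
    void this.drain();
  }

  private async drain(): Promise<void> {
    if (this.processing) return; // another drain loop is already running
    this.processing = true;
    try {
      while (this.queue.length > 0) {
        const next = this.queue.shift() as string;
        try {
          await this.handle(next);
        } catch {
          // A failed command must not stall the commands behind it.
        }
      }
    } finally {
      this.processing = false;
    }
  }
}
```

The `processing` flag is the whole mechanism: a command that arrives while another is in flight is only pushed onto the queue, and the already-running loop picks it up later.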

Backpressure Handling

The daemon uses backpressure-aware socket writes to prevent buffer overflow:
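// Simplified from daemon.ts:29-63
import * as net from 'net';

export function safeWrite(socket: net.Socket, payload: string): Promise<void> {
  return new Promise((resolve) => {
    // write() returns false once the socket's write buffer is full
    const canContinue = socket.write(payload);
    if (canContinue) {
      resolve();
    } else {
      // Wait for the 'drain' event before resolving
      socket.once('drain', () => resolve());
    }
  });
}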
If the socket's send buffer is full (socket.write() returns false), writes pause until the 'drain' event fires. This prevents data loss when the CLI can't read responses fast enough.

Security Model

Socket Permissions

The socket directory is created with 0o700 permissions (owner-only access), which restricts access to the socket files inside it:
// From daemon.ts:330-333
if (!fs.existsSync(socketDir)) {
  fs.mkdirSync(socketDir, { recursive: true, mode: 0o700 });
}
This prevents other users from connecting to your browser session.

HTTP Request Detection

The daemon rejects HTTP requests to prevent cross-origin attacks:
// From daemon.ts:584-590
if (/^(GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH|CONNECT|TRACE)\s/i.test(trimmed)) {
  socket.destroy();
  return;
}
Legitimate clients send raw JSON (starts with {), while browsers open with an HTTP request line such as GET / HTTP/1.1. This blocks malicious web pages from controlling your browser.
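The check can be isolated into a small predicate for experimentation. The pattern is copied from the excerpt above; the constant and function names are hypothetical.

```typescript
// Pattern taken from the excerpt above: an HTTP method followed by whitespace.
const HTTP_METHOD_PATTERN =
  /^(GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH|CONNECT|TRACE)\s/i;

// Hypothetical helper: true when the first line of a connection
// looks like an HTTP request line rather than a JSON command.
function looksLikeHttpRequest(firstLine: string): boolean {
  return HTTP_METHOD_PATTERN.test(firstLine.trim());
}
```

Because the pattern requires whitespace after the method name, ordinary text that merely begins with the same letters is not treated as HTTP.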

Browser Providers

The daemon supports multiple browser sources:

Local Browser (Default)

Launches Playwright’s bundled Chromium:
// From browser.ts:1390-1395
this.browser = await launcher.launch({
  headless: options.headless ?? true,
  executablePath: options.executablePath,
  args: baseArgs,
});

CDP Connection

Connects to an existing browser via Chrome DevTools Protocol:
// From browser.ts:1524
const browser = await chromium.connectOverCDP(cdpUrl);
Useful for:
  • Connecting to desktop Chrome/Chromium
  • Controlling Electron apps
  • Debugging mobile browsers via remote debugging

Cloud Providers

Connects to remote browser services (Browserbase, Browser Use, Kernel):
// Abridged from browser.ts:911-969 (Browserbase example)
const response = await fetch('https://api.browserbase.com/v1/sessions', {
  method: 'POST',
  headers: { 'X-BB-API-Key': browserbaseApiKey },
  body: JSON.stringify({ projectId: browserbaseProjectId }),
});
// Parse the response body to obtain the session's connection URL
const session = await response.json();
const browser = await chromium.connectOverCDP(session.connectUrl);
This enables serverless deployment without bundling a full browser.

Platform Support

Platform | Architecture | Daemon IPC    | Native Binary
macOS    | ARM64        | Unix Socket   |
macOS    | x64          | Unix Socket   |
Linux    | ARM64        | Unix Socket   |
Linux    | x64          | Unix Socket   |
Windows  | x64          | TCP Localhost |
All platforms fall back to Node.js if the native binary is unavailable.

Environment Configuration

The daemon reads configuration from environment variables:
Variable                      | Purpose
AGENT_BROWSER_SESSION         | Session name (default: default)
AGENT_BROWSER_SOCKET_DIR      | Socket directory override
AGENT_BROWSER_DAEMON          | Force daemon mode (1 to enable)
AGENT_BROWSER_DEFAULT_TIMEOUT | Playwright timeout in ms (default: 25000)
AGENT_BROWSER_DEBUG           | Enable debug logging
See daemon.ts and browser.ts for the complete list.
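As a rough illustration of how these variables might be gathered into one object, consider the sketch below. `DaemonEnvConfig` and `readEnvConfig` are hypothetical; treating any non-empty AGENT_BROWSER_DEBUG value as enabled, and falling back to 25000 for an invalid timeout, are assumptions not confirmed by this page.

```typescript
// Hypothetical config shape covering the variables documented above.
interface DaemonEnvConfig {
  session: string;
  socketDir: string | undefined;
  daemonMode: boolean;
  defaultTimeoutMs: number;
  debug: boolean;
}

function readEnvConfig(env: Record<string, string | undefined>): DaemonEnvConfig {
  const timeout = Number.parseInt(env.AGENT_BROWSER_DEFAULT_TIMEOUT ?? "", 10);
  return {
    session: env.AGENT_BROWSER_SESSION || "default",
    socketDir: env.AGENT_BROWSER_SOCKET_DIR || undefined,
    daemonMode: env.AGENT_BROWSER_DAEMON === "1",
    // Assumption: use the documented default when the value is not a positive integer.
    defaultTimeoutMs: Number.isFinite(timeout) && timeout > 0 ? timeout : 25000,
    // Assumption: any non-empty value enables debug logging.
    debug: Boolean(env.AGENT_BROWSER_DEBUG),
  };
}
```

In a real process this would be called as `readEnvConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.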

Performance Characteristics

Command Overhead

  • Native binary: < 1ms parsing + IPC round-trip
  • npx fallback: 100-500ms Node.js startup overhead
  • Daemon IPC: ~1-2ms round-trip (Unix socket) or ~2-5ms (TCP localhost)

Browser Startup

  • First command: 2-3 seconds to launch Chromium
  • Subsequent commands: no launch cost (browser already running)

Memory Usage

  • Rust CLI: ~1-2 MB
  • Node.js daemon: ~50-100 MB
  • Chromium browser: ~200-500 MB (depends on page complexity)

Error Handling

Stale Daemon Detection

The CLI checks if the PID in the PID file is still alive:
// From main.rs (conceptual - actual code in daemon.ts:260-274)
process.kill(pid, 0);  // Signal 0 = existence check
// ESRCH = process doesn't exist (stale)
// EPERM = process exists but we can't signal it
Stale sockets/PIDs are automatically cleaned up.
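The signal-0 probe is usually wrapped in a try/catch, because `process.kill` reports a missing process by throwing. A minimal sketch follows; `isProcessAlive` is a hypothetical name, not necessarily the function used in daemon.ts.

```typescript
// Hypothetical helper wrapping the signal-0 existence check.
function isProcessAlive(pid: number): boolean {
  try {
    // Signal 0 sends nothing; it only checks whether the PID can be signalled.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user, so it is alive.
    // ESRCH (or anything else): treat the PID as stale.
    return (err as { code?: string }).code === "EPERM";
  }
}
```

A caller would read the PID from the session's PID file and, when this returns false, remove the leftover socket and PID files before starting a new daemon.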

Timeout Handling

Playwright operations time out after 25 seconds by default:
// From browser.ts:39-48
export function getDefaultTimeout(): number {
  return process.env.AGENT_BROWSER_DEFAULT_TIMEOUT
    ? parseInt(process.env.AGENT_BROWSER_DEFAULT_TIMEOUT, 10)
    : 25000;
}
This is below the CLI’s 30-second read timeout to ensure clean error messages instead of EAGAIN failures.

Next Steps

  • Snapshot Refs - Learn how element references work
  • Sessions - Understand session isolation and persistence
  • Selectors - Master element selection strategies
