AI Coding Assistants

Agent-browser integrates seamlessly with AI coding assistants and agents to automate browser tasks.

Quick Start

The simplest approach is to just tell your agent to use it:

Use agent-browser to test the login flow. Run agent-browser --help to see available commands.

The --help output is comprehensive and most agents can figure it out from there.

AI Coding Assistant Integration (Recommended)

Add the agent-browser skill to your AI coding assistant for richer context:

npx skills add vercel-labs/agent-browser

This works with:

Claude Code
Codex
Cursor
Gemini CLI
GitHub Copilot
Goose
OpenCode
Windsurf

The skill is fetched from the repository, so it stays up to date automatically. Do not copy SKILL.md from node_modules as it will become stale.

Claude Code Setup

Install the skill

In your project directory:

npx skills add vercel-labs/agent-browser

This adds the skill to .claude/skills/agent-browser/SKILL.md.

Use with Claude Code

The skill teaches Claude Code the full agent-browser workflow, including:

Snapshot-ref interaction pattern
Session management
Timeout handling
Security best practices

Just ask Claude Code to automate browser tasks:

Test the signup flow on https://example.com/signup

AGENTS.md / CLAUDE.md Instructions

For more consistent results, add agent-browser instructions to your project or global instructions file:

## Browser Automation

Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.

Core workflow:
1. `agent-browser open <url>` - Navigate to page
2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
4. Re-snapshot after page changes

Core Workflow for Agents

Every browser automation follows this pattern:

Navigate

agent-browser open <url>

Get snapshot with refs

agent-browser snapshot -i

Output includes refs like @e1, @e2, @e3 for each interactive element.

Interact using refs

agent-browser click @e1
agent-browser fill @e2 "text"
agent-browser click @e3

Re-snapshot after changes

agent-browser snapshot -i  # Get fresh refs after navigation

Essential Commands for AI Agents

agent-browser open <url>              # Navigate to URL
agent-browser close                   # Close browser

Snapshot with Refs

agent-browser snapshot -i             # Interactive elements with refs
agent-browser snapshot -i --json      # JSON output for parsing
agent-browser snapshot -i -C          # Include cursor-interactive (divs with onclick)

Interaction

agent-browser click @e1               # Click element by ref
agent-browser fill @e2 "text"         # Fill input by ref
agent-browser type @e3 "text"         # Type without clearing
agent-browser press Enter             # Press key
agent-browser select @e4 "option"     # Select dropdown option

Getting Information

agent-browser get text @e1            # Get element text
agent-browser get url                 # Get current URL
agent-browser get title               # Get page title

Waiting

agent-browser wait @e1                # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page"    # Wait for URL pattern
agent-browser wait 2000               # Wait milliseconds

Screenshots

agent-browser screenshot              # Screenshot to temp dir
agent-browser screenshot --annotate   # With numbered labels
agent-browser screenshot --full       # Full page

JSON Output Mode

Use --json for machine-readable output:

agent-browser snapshot --json
# Returns: {"success":true,"data":{"snapshot":"...","refs":{"e1":{...}}}}

agent-browser get text @e1 --json
agent-browser is visible @e2 --json

Command Chaining

Commands can be chained with && for efficiency:

# Open, wait, and snapshot in one call
agent-browser open example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

# Chain interactions
agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "pass" && agent-browser click @e3

Use && when you don’t need intermediate output. Run separately when you need to parse output first.

Example: Form Automation

# Navigate and get snapshot
agent-browser open https://example.com/form
agent-browser snapshot -i

# Output:
# - textbox "Name" [ref=e1]
# - textbox "Email" [ref=e2]
# - combobox "Country" [ref=e3]
# - checkbox "Subscribe" [ref=e4]
# - button "Submit" [ref=e5]

# Fill form using refs
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "United States"
agent-browser check @e4
agent-browser click @e5

# Wait for redirect and verify
agent-browser wait --url "**/success"
agent-browser snapshot -i

Example: Data Extraction

# Navigate and snapshot
agent-browser open https://example.com/products
agent-browser snapshot -i --json > snapshot.json

# Extract specific element text
agent-browser get text @e1 --json
agent-browser get text @e2 --json

# Clean up
agent-browser close

Example: Authenticated Workflow

# Save credentials once (encrypted)
echo "password" | agent-browser auth save myapp \
  --url https://app.example.com/login \
  --username user@example.com \
  --password-stdin

# Login using saved credentials
agent-browser auth login myapp

# Now authenticated - continue workflow
agent-browser open https://app.example.com/dashboard
agent-browser snapshot -i

Important: Ref Lifecycle

Refs are invalidated when the page changes. Always re-snapshot after:

Clicking links or buttons that navigate
Form submissions
Dynamic content loading

agent-browser click @e5              # Navigates to new page
agent-browser snapshot -i            # MUST re-snapshot
agent-browser click @e1              # Use new refs

Session Management

Use named sessions for parallel agent operations:

# Agent 1
agent-browser --session agent1 open site-a.com
agent-browser --session agent1 snapshot -i

# Agent 2 (parallel)
agent-browser --session agent2 open site-b.com
agent-browser --session agent2 snapshot -i

# Clean up
agent-browser --session agent1 close
agent-browser --session agent2 close

Timeouts and Slow Pages

For slow websites, use explicit waits:

# Wait for network to settle
agent-browser wait --load networkidle

# Wait for specific element
agent-browser wait @e1

# Wait for URL pattern
agent-browser wait --url "**/dashboard"

Error Handling

Agent-browser returns non-zero exit codes on errors. Check command success:

if agent-browser click @e1; then
  echo "Click successful"
else
  echo "Click failed"
fi

Security for AI Agents

Enable security features for production deployments:

# Content boundaries (helps LLMs distinguish page content)
export AGENT_BROWSER_CONTENT_BOUNDARIES=1

# Domain allowlist (restrict navigation)
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"

# Output limits (prevent context flooding)
export AGENT_BROWSER_MAX_OUTPUT=50000

# Action policy (gate destructive actions)
export AGENT_BROWSER_ACTION_POLICY=./policy.json

Best Practices

Always close sessions

Clean up browser sessions when done to avoid leaked processes:

agent-browser close

Use refs, not CSS selectors

Refs from snapshots are more reliable than CSS selectors for AI agents.

Re-snapshot after navigation

Always take a fresh snapshot after the page changes.

Use --json for parsing

JSON output is easier for agents to parse programmatically.

Chain commands when possible

Use && to chain commands that don’t need intermediate parsing.

Full Reference

For complete command reference and advanced features:

Get Started

Core Concepts

Commands

Security

Advanced

Integrations

Guides

Quick Start

AI Coding Assistant Integration (Recommended)

Claude Code Setup

AGENTS.md / CLAUDE.md Instructions

Core Workflow for Agents

Essential Commands for AI Agents

Navigation

Snapshot with Refs

Interaction

Getting Information

Waiting

Screenshots

JSON Output Mode

Command Chaining

Example: Form Automation

Example: Data Extraction

Example: Authenticated Workflow

Important: Ref Lifecycle

Session Management

Timeouts and Slow Pages

Error Handling

Security for AI Agents

Best Practices

Always close sessions

Use refs, not CSS selectors

Re-snapshot after navigation

Use --json for parsing

Chain commands when possible

Full Reference

Build docs developers (and LLMs) love

Get Started

Core Concepts

Commands

Security

Advanced

Integrations

Guides

Documentation Index

​Quick Start

​AI Coding Assistant Integration (Recommended)

​Claude Code Setup

​AGENTS.md / CLAUDE.md Instructions

​Core Workflow for Agents

​Essential Commands for AI Agents

​Navigation

​Snapshot with Refs

​Interaction

​Getting Information

​Waiting

​Screenshots

​JSON Output Mode

​Command Chaining

​Example: Form Automation

​Example: Data Extraction

​Example: Authenticated Workflow

​Important: Ref Lifecycle

​Session Management

​Timeouts and Slow Pages

​Error Handling

​Security for AI Agents

​Best Practices

Always close sessions

Use refs, not CSS selectors

Re-snapshot after navigation

Use --json for parsing

Chain commands when possible

​Full Reference

Build docs developers (and LLMs) love

Quick Start

AI Coding Assistant Integration (Recommended)

Claude Code Setup

AGENTS.md / CLAUDE.md Instructions

Core Workflow for Agents

Essential Commands for AI Agents

Navigation

Snapshot with Refs

Interaction

Getting Information

Waiting

Screenshots

JSON Output Mode

Command Chaining

Example: Form Automation

Example: Data Extraction

Example: Authenticated Workflow

Important: Ref Lifecycle

Session Management

Timeouts and Slow Pages

Error Handling

Security for AI Agents

Best Practices

Full Reference