Quick Start
The simplest approach is to tell your agent to use agent-browser. The --help output is comprehensive, and most agents can figure it out from there.
AI Coding Assistant Integration (Recommended)
Add the agent-browser skill to your AI coding assistant for richer context:
- Claude Code
- Codex
- Cursor
- Gemini CLI
- GitHub Copilot
- Goose
- OpenCode
- Windsurf
Do not copy SKILL.md from node_modules, as it will become stale.
Claude Code Setup
Install the skill
Run the install command in your project directory. This adds the skill to .claude/skills/agent-browser/SKILL.md.
AGENTS.md / CLAUDE.md Instructions
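For more consistent results, add agent-browser instructions to your project or global instructions file.
Core Workflow for Agents
Every browser automation follows the same pattern: navigate to a page, take a snapshot to get refs, interact using those refs, and re-snapshot after the page changes.
Essential Commands for AI Agents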
Navigation
Snapshot with Refs
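As a rough illustration of how an agent script might pull refs out of a snapshot, the sketch below assumes that snapshot lines carry a marker of the form [ref=e1] and that later commands address the element as @e1. The sample text and the helper name extract_refs are hypothetical; compare against real snapshot output before relying on the pattern.

```python
import re

# Assumed marker format inside snapshot lines: [ref=e2]. Adjust to real output.
REF_PATTERN = re.compile(r"\[ref=(e\d+)\]")


def extract_refs(snapshot_text: str) -> list[str]:
    """Return refs in document order, formatted as @eN for later commands."""
    return [f"@{ref}" for ref in REF_PATTERN.findall(snapshot_text)]


# Hypothetical snapshot output, for illustration only.
sample = '''- heading "Example Domain" [ref=e1]
- link "More information" [ref=e2]'''

print(extract_refs(sample))
```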
Interaction
Getting Information
Waiting
Screenshots
JSON Output Mode
Use --json for machine-readable output:
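A minimal sketch of consuming JSON output from a script, assuming --json can be appended to a command and that stdout then holds a single JSON document. The helper names run_json and parse_output are hypothetical, and no particular output schema is assumed.

```python
import json
import subprocess
from typing import Any


def parse_output(stdout: str) -> Any:
    """Parse the JSON document printed by a command run with --json."""
    return json.loads(stdout)


def run_json(*args: str) -> Any:
    """Run an agent-browser command with --json and return the parsed result.

    check=True raises CalledProcessError on a non-zero exit code.
    """
    result = subprocess.run(
        ["agent-browser", *args, "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_output(result.stdout)
```

With the CLI installed, run_json("snapshot") would return whatever structure the tool emits; inspect it once before writing code that depends on specific keys.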
Command Chaining
Commands can be chained with && for efficiency.
Use && when you don’t need intermediate output. Run commands separately when you need to parse output first.
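For scripts that assemble commands programmatically, a chained line can be built from argument lists. This is a sketch: chain is a hypothetical helper, and the open and snapshot subcommand names are taken on the assumption that they match the Navigation and Snapshot commands described above.

```python
import shlex


def chain(*commands: list[str]) -> str:
    """Join argv lists into one shell line separated by &&, quoting each argument."""
    return " && ".join(shlex.join(cmd) for cmd in commands)


line = chain(
    ["agent-browser", "open", "https://example.com"],
    ["agent-browser", "snapshot"],
)
print(line)
```

The resulting string can be passed to subprocess.run(line, shell=True, check=True). Because && stops at the first failing command, a non-zero exit code from any step aborts the rest of the chain.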
Example: Form Automation
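One way a form flow could be scripted is sketched below. The subcommand names (open, snapshot, fill, click), the URL, and the refs are assumptions for illustration; in practice the refs come from reading the snapshot output, not from hardcoded values.

```python
import subprocess


def form_commands(url: str, fields: dict[str, str], submit_ref: str) -> list[list[str]]:
    """Build the argv sequence: open, snapshot, fill each field, submit, re-snapshot."""
    cmds = [["agent-browser", "open", url], ["agent-browser", "snapshot"]]
    cmds += [["agent-browser", "fill", ref, value] for ref, value in fields.items()]
    cmds.append(["agent-browser", "click", submit_ref])
    # Refs are invalid after submission, so finish with a fresh snapshot.
    cmds.append(["agent-browser", "snapshot"])
    return cmds


def run_all(cmds: list[list[str]]) -> None:
    """Run each command in order, stopping at the first non-zero exit code."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```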
Example: Data Extraction
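A possible shape for data extraction is to map field names to refs and read the text behind each ref. The sketch assumes a get text subcommand that prints an element's text; cli_text and extract are hypothetical helpers, and the reader function is injectable so the logic can be exercised without a browser.

```python
import subprocess
from typing import Callable


def cli_text(ref: str) -> str:
    """Read an element's text via the CLI (assumed subcommand: get text)."""
    out = subprocess.run(
        ["agent-browser", "get", "text", ref],
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


def extract(refs: dict[str, str], get_text: Callable[[str], str] = cli_text) -> dict[str, str]:
    """Return {field name: text} for each {field name: ref} pair."""
    return {field: get_text(ref) for field, ref in refs.items()}
```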
Example: Authenticated Workflow
Important: Ref Lifecycle
Refs are invalidated when the page changes. Always re-snapshot after:
- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading
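The rule above can be enforced in a wrapper that discards its cached snapshot after every action. This is a conservative sketch: the Page class is hypothetical, it treats every action as potentially page-changing, and the snapshot subcommand name is an assumption.

```python
from typing import Callable, Optional


class Page:
    """Caches the latest snapshot and drops it whenever an action runs."""

    def __init__(self, run: Callable[[list[str]], str]):
        self._run = run  # executes an argv list and returns its stdout
        self._snapshot: Optional[str] = None

    def snapshot(self) -> str:
        """Return the cached snapshot, taking a fresh one if refs may be stale."""
        if self._snapshot is None:
            self._snapshot = self._run(["agent-browser", "snapshot"])
        return self._snapshot

    def act(self, *args: str) -> str:
        """Run an action, then invalidate the cache so the next read re-snapshots."""
        output = self._run(["agent-browser", *args])
        self._snapshot = None
        return output
```

Invalidating on every action costs an extra snapshot now and then, but it avoids acting on refs that no longer point at anything.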
Session Management
Use named sessions for parallel agent operations.
Timeouts and Slow Pages
For slow websites, use explicit waits.
Error Handling
agent-browser returns non-zero exit codes on errors. Check that each command succeeded before continuing.
Security for AI Agents
Enable security features for production deployments.
Best Practices
Always close sessions
Clean up browser sessions when done to avoid leaked processes:
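A sketch of guaranteed cleanup: run the steps inside try/finally so the session is closed even when a step fails. The close subcommand name is an assumption, and run_session and cli are hypothetical helpers; the runner is injectable so the control flow can be checked without launching a browser.

```python
import subprocess
from typing import Callable, Sequence


def cli(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def run_session(steps: Sequence[Sequence[str]], run: Callable[[Sequence[str]], int] = cli) -> bool:
    """Run steps in order, stop at the first non-zero exit code, always close."""
    ok = True
    try:
        for step in steps:
            if run(["agent-browser", *step]) != 0:
                ok = False
                break
    finally:
        # Close the browser even if a step failed or raised.
        run(["agent-browser", "close"])
    return ok
```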
Use refs, not CSS selectors
Refs from snapshots are more reliable than CSS selectors for AI agents.
Re-snapshot after navigation
Always take a fresh snapshot after the page changes.
Use --json for parsing
JSON output is easier for agents to parse programmatically.
Chain commands when possible
Use && to chain commands that don’t need intermediate parsing.