Selectors - Agent Browser

Overview

Agent Browser supports multiple selector strategies for finding and interacting with elements:

Refs (@e1) - Recommended for AI agents
CSS Selectors (#id, .class) - Traditional web selectors
Semantic Locators (find role button) - Human-readable element queries
Text Selectors (text=Submit) - Find by visible text
XPath (xpath=//button) - XML path queries

Each has different tradeoffs for stability, readability, and performance.

Refs (Recommended for AI)

Overview

Refs provide deterministic element selection from snapshots:

# Get snapshot
agent-browser snapshot -i
# Output:
# - button "Submit" [ref=e1]
# - textbox "Email" [ref=e2]

# Use refs
agent-browser click @e1
agent-browser fill @e2 "user@example.com"

Why Use Refs?

✓ Deterministic: Points to exact element from snapshot
✓ Fast: No DOM re-query needed
✓ AI-Friendly: Easy for LLMs to generate
✓ Accessible: Based on ARIA tree (screen reader compatible)

Ref Formats

Three equivalent formats:

agent-browser click @e1      # Recommended
agent-browser click ref=e1   # Alternative
agent-browser click e1       # Bare format

All three parse to the same reference:

// From snapshot.ts:605-615
export function parseRef(arg: string): string | null {
  if (arg.startsWith('@')) return arg.slice(1);
  if (arg.startsWith('ref=')) return arg.slice(4);
  if (/^e\d+$/.test(arg)) return arg;
  return null;
}

How Refs Resolve

Refs are mapped to Playwright locators:

// From browser.ts:220-240
const refData = this.refMap[ref];
let locator = page.getByRole(refData.role, {
  name: refData.name,
  exact: true,
});

if (refData.nth !== undefined) {
  locator = locator.nth(refData.nth);
}

Example:

// Ref map:
refs = {
  "e1": {
    role: "button",
    name: "Submit",
    nth: 0  // First submit button
  }
}

// Resolves to:
page.getByRole('button', { name: 'Submit', exact: true }).nth(0)
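
To make the shape of the ref map more concrete, here is a minimal sketch of how one could be derived from snapshot output lines such as `- button "Submit" [ref=e1]`. The names `RefEntry`, `SNAPSHOT_LINE`, and `buildRefMap` are hypothetical, and the rule that `nth` is set only when several elements share the same role and name is an assumption for illustration, not the confirmed Agent Browser implementation.

```typescript
// Hypothetical sketch: derive a ref map from snapshot text.
interface RefEntry {
  role: string;
  name: string;
  nth?: number; // assumption: set only when role + name is not unique
}

// Matches lines such as: - button "Submit" [ref=e1]
const SNAPSHOT_LINE = /^\s*-\s+(\w+)\s+"([^"]*)"\s+\[ref=(e\d+)\]/;

function buildRefMap(snapshot: string): Record<string, RefEntry> {
  const refs: Record<string, RefEntry> = {};
  const groups = new Map<string, string[]>(); // "role|name" -> refs in document order

  for (const line of snapshot.split("\n")) {
    const match = SNAPSHOT_LINE.exec(line);
    if (match === null) continue; // skip lines that carry no ref
    const role = match[1];
    const label = match[2];
    const ref = match[3];
    refs[ref] = { role: role, name: label };

    const key = role + "|" + label;
    const group = groups.get(key) ?? [];
    group.push(ref);
    groups.set(key, group);
  }

  // Disambiguate duplicates by their position among identical role + name pairs.
  groups.forEach((group) => {
    if (group.length > 1) {
      group.forEach((ref, index) => {
        refs[ref].nth = index;
      });
    }
  });
  return refs;
}
```

With two "Submit" buttons in the snapshot, the first would get `nth: 0` and the second `nth: 1`, which lines up with the `.nth(0)` call in the resolved locator above.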

Ref Scoping

Refs are scoped to a single snapshot. After navigation or page changes, get a fresh snapshot:

agent-browser snapshot -i      # Snapshot 1: refs e1, e2, e3
agent-browser click @e1        # Use ref from snapshot 1
agent-browser snapshot -i      # Snapshot 2: NEW refs e1, e2, e3
agent-browser fill @e2 "text" # Use ref from snapshot 2

Using stale refs may fail or interact with the wrong element. See Snapshot Refs for details.
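
The scoping rule can be illustrated with a small sketch in which every snapshot replaces the previous ref map instead of merging into it. `SnapshotRefStore`, `ScopedRef`, and the generation counter are hypothetical names introduced for this example; whether Agent Browser tracks snapshot generations this way is an assumption.

```typescript
// Hypothetical sketch: refs live only as long as the snapshot that produced them.
interface ScopedRef {
  role: string;
  name: string;
}

class SnapshotRefStore {
  private generation = 0;
  private refs = new Map<string, ScopedRef>();

  // Called after every snapshot: earlier refs are discarded, not merged.
  replace(next: Record<string, ScopedRef>): number {
    const fresh = new Map<string, ScopedRef>();
    for (const key of Object.keys(next)) {
      fresh.set(key, next[key]);
    }
    this.refs = fresh;
    this.generation += 1;
    return this.generation;
  }

  // Resolve a ref, optionally checking that it came from the current snapshot.
  resolve(ref: string, fromGeneration?: number): ScopedRef {
    if (fromGeneration !== undefined && fromGeneration !== this.generation) {
      throw new Error("Stale ref " + ref + ": take a fresh snapshot");
    }
    const entry = this.refs.get(ref);
    if (entry === undefined) {
      throw new Error("Unknown ref " + ref + ": take a fresh snapshot");
    }
    return entry;
  }
}
```

Without the generation check, `e1` from the first snapshot would silently resolve to whatever `e1` means in the second one, which is the "wrong element" case described above.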

CSS Selectors

Basic CSS

Standard CSS selector syntax:

# ID selector
agent-browser click "#submit-button"

# Class selector
agent-browser click ".btn-primary"

# Attribute selector
agent-browser click "[data-testid='login-button']"

# Combinator
agent-browser click "form > button"

# Pseudo-class
agent-browser click "button:first-child"

When to Use CSS

✓ Stable IDs: When elements have unique, stable IDs
✓ Test IDs: When using data-testid attributes
✓ Simple queries: For one-off scripts
✗ Dynamic classes: Avoid if classes change frequently
✗ AI workflows: Hard for LLMs to generate correctly

Performance

CSS selectors are fast (direct DOM query), but may be brittle:

# Fast but brittle
agent-browser click ".submit-btn.primary.large"

# Better - use stable attributes
agent-browser click "[data-testid='submit-button']"

Semantic Locators

Overview

Find elements by their semantic meaning instead of DOM structure:

agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "user@example.com"
agent-browser find text "Sign In" click

Role Locators

Find by ARIA role and optional name:

# By role only
agent-browser find role button click

# By role and name
agent-browser find role button click --name "Submit"

# Exact match
agent-browser find role link click --name "Home" --exact

Supported roles:

Role	Element Examples
`button`	`<button>`, `<input type="button">`, `role="button"`
`link`	`<a href="...">`
`textbox`	`<input type="text">`, `<textarea>`
`checkbox`	`<input type="checkbox">`
`radio`	`<input type="radio">`
`combobox`	`<select>`, ARIA comboboxes
`heading`	`<h1>`, `<h2>`, etc.

See the ARIA roles spec for the full list.

Text Locators

Find by visible text content:

# Contains text
agent-browser find text "Sign In" click

# Exact text match
agent-browser find text "Submit" click --exact

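Without --exact, a locator matches any element whose text contains the query. Use --exact for strict matching (whole string, no substring matches). Case handling follows the underlying Playwright text locator, which ignores case unless exact matching is requested.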

Label Locators

Find inputs by their associated label:

agent-browser find label "Email" fill "user@example.com"
agent-browser find label "Password" fill "secret123"

This works for:

<label for="email">Email</label><input id="email">
<label>Email <input></label>
aria-label="Email" attributes
aria-labelledby references

Placeholder Locators

Find inputs by placeholder text:

agent-browser find placeholder "Enter your email" fill "user@example.com"

Alt Text Locators

Find images by alt text:

agent-browser find alt "Company logo" click

Title Locators

Find elements by title attribute:

agent-browser find title "Click to expand" click

Test ID Locators

Find by data-testid attribute:

agent-browser find testid "login-button" click

Positional Locators

Select specific instances when multiple elements match:

# First match
agent-browser find first "button" click

# Last match
agent-browser find last "button" click

# Nth match (0-indexed)
agent-browser find nth 2 "button" click  # Third button
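
The indexing convention is easy to get wrong, so here is a minimal sketch of the three positional modes applied to a plain array of matches. `MatchPosition` and `pickMatch` are hypothetical names; Playwright exposes the same ideas on locators as `first()`, `last()`, and `nth(index)`, but how Agent Browser wires them up is not shown here.

```typescript
// Hypothetical sketch: positional selection over an ordered list of matches.
type MatchPosition = "first" | "last" | { nth: number };

function pickMatch<T>(matches: readonly T[], position: MatchPosition): T | undefined {
  if (position === "first") return matches[0];
  if (position === "last") return matches[matches.length - 1];
  // 0-indexed: nth 2 is the third match; out-of-range yields undefined.
  return matches[position.nth];
}
```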

Actions

Semantic locators support these actions:

Action	Description	Example
`click`	Click element	`find role button click`
`fill`	Fill input	`find label "Email" fill "user@example.com"`
`type`	Type into input	`find label "Search" type "query"`
`hover`	Hover element	`find role link hover`
`focus`	Focus element	`find role textbox focus`
`check`	Check checkbox	`find role checkbox check`
`uncheck`	Uncheck checkbox	`find role checkbox uncheck`
`text`	Get text content	`find role heading text`
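
As a rough illustration of how the action column could be validated, the sketch below checks an action name against the table and enforces that a value is present only where one is expected. The rule that only `fill` and `type` take a value is an assumption read off the examples, and `parseFindAction` is a hypothetical helper, not part of the CLI.

```typescript
// Actions taken from the table above; the validation rules are assumptions.
const FIND_ACTIONS = ["click", "fill", "type", "hover", "focus", "check", "uncheck", "text"] as const;
type FindAction = (typeof FIND_ACTIONS)[number];

// Assumed: only fill and type take a value argument.
const NEEDS_VALUE: readonly FindAction[] = ["fill", "type"];

interface FindCommand {
  action: FindAction;
  value?: string;
}

function parseFindAction(action: string, value?: string): FindCommand {
  const known = FIND_ACTIONS.find((candidate) => candidate === action);
  if (known === undefined) {
    throw new Error("Unsupported action: " + action);
  }
  const needsValue = NEEDS_VALUE.indexOf(known) !== -1;
  if (needsValue && value === undefined) {
    throw new Error('Action "' + known + '" requires a value');
  }
  if (!needsValue && value !== undefined) {
    throw new Error('Action "' + known + '" does not take a value');
  }
  return value === undefined ? { action: known } : { action: known, value: value };
}
```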

When to Use Semantic Locators

✓ Human-readable: Easy to understand what they select
✓ Stable: Less affected by DOM changes
✓ Accessible: Based on ARIA/semantic HTML
✗ Verbose: Longer than refs
✗ Slower: Requires DOM query on each use

Text Selectors

Exact Text Match

agent-browser click "text=Submit"         # Unquoted: Playwright treats this as a substring match
agent-browser click "text='Submit Form'"  # Quoted: exact match

Substring Match

agent-browser click "text=/.*Submit.*/"  # Regex

Case-Insensitive

agent-browser click "text=/submit/i"  # Case-insensitive

XPath Selectors

Basic XPath

# Absolute path
agent-browser click "xpath=/html/body/div/button"

# Relative path
agent-browser click "xpath=//button[@id='submit']"

# By text
agent-browser click "xpath=//button[text()='Submit']"

# By attribute
agent-browser click "xpath=//button[@data-testid='login']"

When to Use XPath

✓ Complex queries: When CSS can’t express the logic
✓ Text-based selection: XPath has better text functions
✗ Readability: Hard to read and maintain
✗ Performance: Generally slower than CSS

Selector Precedence

When a selector could match multiple strategies, Agent Browser checks in this order:

Ref: @e1, ref=e1, e1 (if matches /^e\d+$/)
Explicit prefix: text=, xpath=
CSS: Anything else

# Interpreted as ref
agent-browser click @e1

# Interpreted as text selector
agent-browser click "text=Submit"

# Interpreted as XPath
agent-browser click "xpath=//button"

# Interpreted as CSS
agent-browser click "#submit"
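
The precedence order can be sketched as a small classifier. `classifySelector`, `SelectorKind`, and `ClassifiedSelector` are hypothetical names used only for illustration; `parseRefArg` repeats the logic of the `parseRef` excerpt shown earlier, and the actual dispatch code in Agent Browser may be organized differently.

```typescript
// Hypothetical classifier mirroring the documented precedence order.
type SelectorKind = "ref" | "text" | "xpath" | "css";

interface ClassifiedSelector {
  kind: SelectorKind;
  value: string; // the selector with any prefix removed
}

// Same logic as the parseRef excerpt earlier on this page.
function parseRefArg(arg: string): string | null {
  if (arg.startsWith("@")) return arg.slice(1);
  if (arg.startsWith("ref=")) return arg.slice(4);
  if (/^e\d+$/.test(arg)) return arg;
  return null;
}

function classifySelector(arg: string): ClassifiedSelector {
  // 1. Refs are checked first: @e1, ref=e1, or bare e1.
  const ref = parseRefArg(arg);
  if (ref !== null) return { kind: "ref", value: ref };

  // 2. Explicit prefixes.
  if (arg.startsWith("text=")) return { kind: "text", value: arg.slice(5) };
  if (arg.startsWith("xpath=")) return { kind: "xpath", value: arg.slice(6) };

  // 3. Everything else is treated as CSS.
  return { kind: "css", value: arg };
}
```

One consequence of this ordering is that a bare token such as `e1` is always read as a ref and never as a CSS type selector, while a similar-looking token such as `email` falls through to CSS.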

Selector Composition

Combine selectors for more precise targeting:

CSS + Pseudo-Selectors

# First button in a form
agent-browser click "form button:first-child"

# Last item in a list
agent-browser click "ul li:last-child"

# Nth child
agent-browser click "table tr:nth-child(3)"

Chaining Find Commands

# Click the button whose accessible name is "Submit" (role plus name narrows the match)
agent-browser find role "button" click --name "Submit"

# Fill the field labeled "Email"
agent-browser find label "Email" fill "user@example.com"

Special Selectors

Visible Elements Only

Playwright automatically filters to visible elements:

# Only clicks visible buttons
agent-browser click "button"

To include hidden elements, use --force:

agent-browser click "button" --force

Detached Elements

Playwright waits for elements to be attached to the DOM:

# Waits for button to exist before clicking
agent-browser click "button"

Timeout is 25 seconds by default (configurable via AGENT_BROWSER_DEFAULT_TIMEOUT).
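
A minimal sketch of how such a default could be resolved is shown below. It assumes the environment variable holds a value in milliseconds and that invalid values fall back to the 25-second default; neither detail is confirmed by this page. `resolveDefaultTimeoutMs` and `FALLBACK_TIMEOUT_MS` are hypothetical names.

```typescript
// The documented default of 25 seconds, expressed in milliseconds.
const FALLBACK_TIMEOUT_MS = 25000;

// The environment is passed in (rather than read from process.env) to keep the sketch testable.
// Assumption: AGENT_BROWSER_DEFAULT_TIMEOUT is given in milliseconds.
function resolveDefaultTimeoutMs(env: Record<string, string | undefined>): number {
  const raw = env["AGENT_BROWSER_DEFAULT_TIMEOUT"];
  if (raw === undefined || raw.trim() === "") return FALLBACK_TIMEOUT_MS;

  const parsed = Number(raw);
  // Ignore values that are not positive finite numbers.
  if (!Number.isFinite(parsed) || parsed <= 0) return FALLBACK_TIMEOUT_MS;
  return parsed;
}
```

In a Node.js process this would be called as `resolveDefaultTimeoutMs(process.env)`.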

Selector Best Practices

For AI Agents

Use refs:

# Good - deterministic and fast
agent-browser snapshot -i
agent-browser click @e1

# Avoid - brittle and hard for AI to generate
agent-browser click "div.container > form#login > div.actions > button:nth-child(2)"

For Manual Scripting

Use semantic locators or stable CSS:

# Good - semantic and readable
agent-browser find role button click --name "Submit"

# Good - stable test ID
agent-browser click "[data-testid='submit-button']"

# Avoid - brittle classes
agent-browser click ".btn.btn-primary.btn-lg.submit"

For Testing

Use data-testid attributes:

<button data-testid="login-button">Log In</button>

agent-browser click "[data-testid='login-button']"

Test IDs are stable across UI changes.
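
When test IDs come from variables rather than literals, the attribute selector is safer to build with a helper that escapes the value. The sketch below is a hypothetical helper (`testIdSelector` is not part of Agent Browser) that escapes backslashes and single quotes so the result stays a valid CSS string; other special characters such as newlines are not handled.

```typescript
// Hypothetical helper: build a [data-testid='...'] selector from an arbitrary ID.
function testIdSelector(testId: string): string {
  // Escape backslashes first, then single quotes, so the quoted value stays valid CSS.
  const escaped = testId.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
  return "[data-testid='" + escaped + "']";
}
```

For example, `testIdSelector("login-button")` produces the same selector used in the command above.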

Performance Comparison

Selector Type	Speed	Stability	AI-Friendly
Refs	⚡⚡⚡ (cached)	⭐⭐⭐	⭐⭐⭐
CSS (ID)	⚡⚡⚡	⭐⭐	⭐
CSS (class)	⚡⚡⚡	⭐	⭐
CSS (data-testid)	⚡⚡⚡	⭐⭐⭐	⭐⭐
Semantic (role)	⚡⚡	⭐⭐⭐	⭐⭐
Text	⚡⚡	⭐⭐	⭐⭐
XPath	⚡	⭐	⭐

Debugging Selectors

Highlight Elements

Highlight an element to verify your selector:

agent-browser highlight "#submit-button"

The element will be outlined in red in the browser.

Count Matches

Check how many elements match a selector:

agent-browser get count "button"
# Output: 5

If the count is greater than 1, the selector is ambiguous and an action may target the wrong element.

Snapshot Preview

Use snapshots to see what elements are available:

agent-browser snapshot -i

This shows all interactive elements with their refs and roles.

Advanced Techniques

Shadow DOM

Penetrate shadow DOM boundaries:

# Playwright auto-pierces shadow DOM
agent-browser click "button"  # Works inside open shadow roots (CSS and text selectors; XPath does not pierce them)

Iframes

Switch to iframe before selecting:

# Switch to iframe
agent-browser frame "#payment-iframe"

# Select inside iframe
agent-browser click "#submit"

# Switch back to main frame
agent-browser frame main

Dynamic Content

Wait for elements to appear:

# Wait for element
agent-browser wait "#results"

# Then interact
agent-browser click "#results button"

Or use find which auto-waits:

# Auto-waits for button to appear
agent-browser find role button click --name "Load More"

Next Steps

Architecture - Understand the Rust CLI + Node.js daemon design
Snapshot Refs - Deep dive into the ref system
Sessions - Learn about session isolation and persistence
