Accessibility Snapshots

Playwriter uses accessibility snapshots to give AI agents a text-based view of web pages. Instead of sending screenshots (100KB+ images), snapshots provide a compact, searchable tree of interactive elements with ready-to-use locators.

What is an Accessibility Snapshot?

An accessibility snapshot is a text tree representation of the browser’s accessibility tree (the same data screen readers use). It shows:

Semantic roles (button, link, textbox, heading, etc.)
Accessible names (button text, link text, input labels)
Playwright locators you can use immediately
Element refs for visual label lookup

Example output:

- banner:
  - link "Home" [id="nav-home"]
  - navigation:
    - link "Docs" [data-testid="docs-link"]
    - link "Blog" role=link[name="Blog"]
- main:
  - heading "Welcome" role=heading[name="Welcome"]
  - button "Get Started" [data-testid="cta-button"]

Each interactive line ends with a Playwright locator you can pass directly to page.locator().
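Because the locator sits at the end of each line, the snapshot text can also be processed programmatically. The following is a minimal sketch, assuming lines follow the format in the example output above; `extractLocators` and `SnapshotEntry` are hypothetical names, not part of Playwriter's API.

```typescript
// Hypothetical helper: pull { role, name, locator } out of snapshot text.
// The line format is inferred from the example output, not from Playwriter's source.
interface SnapshotEntry {
  role: string
  name: string
  locator: string
}

// role, optional quoted name, then a locator starting with "[" or "role="
const LINE_PATTERN = /^\s*-\s+([\w-]+)(?:\s+"((?:[^"\\]|\\.)*)")?\s+((?:\[|role=).*)$/

function extractLocators(snapshotText: string): SnapshotEntry[] {
  const entries: SnapshotEntry[] = []
  for (const line of snapshotText.split('\n')) {
    const match = LINE_PATTERN.exec(line)
    if (!match) continue // context lines such as "- banner:" carry no locator
    entries.push({ role: match[1], name: match[2] ?? '', locator: match[3].trim() })
  }
  return entries
}
```

Each returned `locator` string is the same text you would otherwise copy by hand into page.locator().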

Why Use Snapshots Instead of Screenshots?

Token efficiency: Snapshots are typically 5-20KB of text, while screenshots are 100KB+ images. For simple, text-heavy pages, snapshots can cut token usage by up to about 95%.

When to use snapshots:

Page has simple, semantic HTML structure
You need to search for specific text or patterns
You want to process the output programmatically (filter, map, search)
Token usage matters (always prefer text over images when possible)

When to use screenshots with labels:

Page has complex visual layout (grids, galleries, maps, dashboards)
Spatial position matters (“first image”, “top-left button”)
DOM order doesn’t match visual order
You need to understand visual hierarchy

See choosing between snapshot methods for details.

How It Works

Source: playwriter/src/aria-snapshot.ts

1. Fetch Accessibility Tree via CDP

Playwriter uses Chrome DevTools Protocol to get the full accessibility tree:

const { nodes: axNodes } = await session.send(
  'Accessibility.getFullAXTree',
  frameId ? { frameId } : undefined,
  oopifSessionId,
)

The accessibility tree includes:

Role (button, link, heading, etc.)
Accessible name (computed from text content, aria-label, title, etc.)
Backend DOM node ID (for mapping to HTML elements)
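To make those three fields concrete, here is a small sketch of reading them from a raw node. The interface is a trimmed-down subset of the CDP `AXNode` type, and `summarizeAXNode` is a hypothetical helper written for illustration, not Playwriter's code.

```typescript
// Minimal subset of the CDP Accessibility.AXNode shape (other fields omitted).
interface AXValue {
  type: string
  value?: unknown
}

interface AXNode {
  nodeId: string
  ignored: boolean
  role?: AXValue
  name?: AXValue
  backendDOMNodeId?: number
  childIds?: string[]
}

interface AXSummary {
  role: string
  name: string
  backendNodeId: number | undefined
}

function asString(value: unknown): string {
  return typeof value === 'string' ? value : ''
}

// Hypothetical helper: reduce a raw AX node to role, name, and backend node ID.
function summarizeAXNode(node: AXNode): AXSummary | null {
  if (node.ignored) return null // ignored nodes are not exposed to assistive tech
  return {
    role: asString(node.role?.value),
    name: asString(node.name?.value),
    backendNodeId: node.backendDOMNodeId,
  }
}
```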

2. Map to DOM Attributes

To generate stable locators, Playwriter fetches the full DOM tree and maps AX nodes to DOM attributes:

const { nodes: domNodes } = await session.send(
  'DOM.getFlattenedDocument',
  { depth: -1, pierce: true },
  oopifSessionId,
)

const domByBackendId = new Map<Protocol.DOM.BackendNodeId, DomNodeInfo>()
for (const node of domNodes) {
  const info: DomNodeInfo = {
    nodeId: node.nodeId,
    backendNodeId: node.backendNodeId,
    nodeName: node.nodeName,
    attributes: toAttributeMap(node.attributes),
  }
  domByBackendId.set(node.backendNodeId, info)
}
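The `toAttributeMap` helper is not shown in this guide. CDP reports a node's attributes as a flat array of interleaved names and values, so a plausible version could look like the sketch below. Treat it as an assumption about the helper's behavior; the real implementation may return a different structure.

```typescript
// Hypothetical implementation of toAttributeMap; the real helper may differ.
// CDP reports DOM.Node.attributes as a flat array: [name1, value1, name2, value2, ...].
function toAttributeMap(attributes: string[] | undefined): Map<string, string> {
  const map = new Map<string, string>()
  if (!attributes) return map // text nodes and similar have no attributes
  for (let i = 0; i + 1 < attributes.length; i += 2) {
    map.set(attributes[i], attributes[i + 1])
  }
  return map
}
```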

3. Filter Interactive Elements

The raw AX tree is far larger than an agent needs, because it covers nearly every node on the page. Playwriter filters it to show only:

Interactive elements: buttons, links, inputs, checkboxes, sliders, etc.
Context elements: navigation, main, form, list, table (for structure)
Labels: Text that labels interactive elements (for clarity)

const INTERACTIVE_ROLES = new Set([
  'button', 'link', 'textbox', 'combobox', 'checkbox', 'radio',
  'slider', 'switch', 'menuitem', 'tab', 'img', 'video', 'audio',
])

const CONTEXT_ROLES = new Set([
  'navigation', 'main', 'contentinfo', 'banner', 'form',
  'section', 'list', 'table', 'row', 'cell',
])

Wrapper hoisting: Empty <div> and <span> wrappers (role=generic) are collapsed to reduce noise.
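Filtering and wrapper hoisting can be pictured as one recursive pass over the tree. This sketch uses a simplified node shape and shortened role sets, and it assumes that empty context nodes are dropped; `RawNode` and `filterTree` are illustrative names only.

```typescript
interface RawNode {
  role: string
  name: string
  children: RawNode[]
}

// Shortened role sets for the example; see the full sets above.
const INTERACTIVE = new Set(['button', 'link', 'textbox', 'checkbox'])
const CONTEXT = new Set(['navigation', 'main', 'form', 'list'])

// Returns the kept nodes for a subtree. A dropped node contributes its kept
// descendants instead, which is what "hoisting" means here.
function filterTree(node: RawNode): RawNode[] {
  const keptChildren: RawNode[] = []
  for (const child of node.children) {
    keptChildren.push(...filterTree(child))
  }
  if (INTERACTIVE.has(node.role)) {
    return [{ ...node, children: keptChildren }]
  }
  // Assumption: a context node is only worth keeping if something useful is inside it.
  if (CONTEXT.has(node.role) && keptChildren.length > 0) {
    return [{ ...node, children: keptChildren }]
  }
  return keptChildren // generic wrappers and other noise: hoist the children
}
```

With this approach a button nested three `<div>` levels deep appears directly under its nearest context node.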

4. Generate Locators

For each interactive element, Playwriter generates a stable locator, using the first option available in this priority order:

Test IDs (most stable): [data-testid="submit"], [data-test-id="login"], etc.
HTML IDs: [id="nav-home"]
Role + name: role=button[name="Submit"]
Role only: role=button (if no accessible name)

function buildBaseLocator({
  role,
  name,
  stable,
}: {
  role: string
  name: string
  stable: { value: string; attr: string } | null
}): string {
  if (stable) {
    return `[${stable.attr}="${escapeLocatorValue(stable.value)}"]`
  }
  const trimmedName = name.trim()
  if (trimmedName.length > 0) {
    return `role=${role}[name="${escapeLocatorValue(trimmedName)}"]`
  }
  return `role=${role}`
}
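The `stable` argument has to come from somewhere. One way to pick it, following the priority order described above, is sketched here. The attribute list is an assumption based on the examples in this section; `pickStableAttribute` is a hypothetical name, and Playwriter may check additional attributes.

```typescript
// Assumed priority list, taken from the order described in this section.
// The attribute names Playwriter actually checks may differ.
const STABLE_ATTRIBUTES = ['data-testid', 'data-test-id', 'id']

function pickStableAttribute(
  attributes: Map<string, string>,
): { value: string; attr: string } | null {
  for (const attr of STABLE_ATTRIBUTES) {
    const value = attributes.get(attr)
    if (value) return { value, attr } // first non-empty match wins
  }
  return null // caller falls back to role + name, then role only
}
```

The return shape matches the `stable` parameter of buildBaseLocator, so a `null` result naturally triggers the role-based fallback.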

Deduplication: If multiple elements share the same locator, Playwright’s >> nth=N is appended:

- button "Delete" role=button[name="Delete"] >> nth=0
- button "Delete" role=button[name="Delete"] >> nth=1
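A two-pass approach is one simple way to produce this output: count each locator first, then number only the ones that repeat. The sketch below is illustrative; `appendNthToDuplicates` is a hypothetical name, and it assumes the input is already in document order so the indexes line up with Playwright's nth matching.

```typescript
function appendNthToDuplicates(locators: string[]): string[] {
  // Pass 1: count how often each locator occurs.
  const totals = new Map<string, number>()
  for (const locator of locators) {
    totals.set(locator, (totals.get(locator) ?? 0) + 1)
  }
  // Pass 2: number every occurrence of a repeated locator, starting at 0.
  const seen = new Map<string, number>()
  return locators.map((locator) => {
    if ((totals.get(locator) ?? 0) < 2) return locator // unique: leave untouched
    const index = seen.get(locator) ?? 0
    seen.set(locator, index + 1)
    return `${locator} >> nth=${index}`
  })
}
```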

5. Generate Refs for Visual Labels

Refs are short identifiers (e1, e2, …) used in visual labels. They’re generated from:

Stable test IDs (preferred): submit-btn, nav-home
Fallback counter: e1, e2, e3 (when no test ID exists)

const createRefForNode = (options: {
  backendNodeId?: Protocol.DOM.BackendNodeId
  role: string
  name: string
}): string | null => {
  const domInfo = options.backendNodeId ? domByBackendId.get(options.backendNodeId) : undefined
  const stable = domInfo ? getStableRefFromAttributes(domInfo.attributes) : null
  const baseRef = stable?.value || `e${++fallbackCounter}`
  
  const count = refCounts.get(baseRef) ?? 0
  refCounts.set(baseRef, count + 1)
  const ref = count === 0 ? baseRef : `${baseRef}-${count + 1}`
  
  refs.push({ ref, role: options.role, name: options.name })
  return ref
}
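To see what names this logic produces, here is a simplified stand-in that takes the stable attribute value directly instead of looking it up from the DOM. `createRefGenerator` is a hypothetical wrapper written for this example; the counting logic mirrors the excerpt above.

```typescript
// Simplified stand-in for createRefForNode: the caller passes the stable
// attribute value (or null) instead of a backend node ID.
function createRefGenerator(): (stableValue: string | null) => string {
  let fallbackCounter = 0
  const refCounts = new Map<string, number>()
  return (stableValue) => {
    const baseRef = stableValue || `e${++fallbackCounter}`
    const count = refCounts.get(baseRef) ?? 0
    refCounts.set(baseRef, count + 1)
    // The second occurrence of a base ref gets the suffix "-2", the third "-3", and so on.
    return count === 0 ? baseRef : `${baseRef}-${count + 1}`
  }
}

const nextRef = createRefGenerator()
const exampleRefs = ['submit-btn', null, 'submit-btn', null].map((value) => nextRef(value))
// exampleRefs: ['submit-btn', 'e1', 'submit-btn-2', 'e2']
```

Note that the fallback counter only advances for elements without a stable value, so `e1` and `e2` stay compact even on pages with many test IDs.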

Visual Labels (Vimium-Style)

For screenshots, Playwriter overlays visual labels on interactive elements:

await screenshotWithAccessibilityLabels({ page })

This:

Calls showAriaRefLabels() to render colored badges with refs (e.g., [e3], [submit-btn])
Takes a screenshot with labels visible
Calls hideAriaRefLabels() to remove badges
Returns screenshot + accessibility snapshot

Label colors (role-based):

Yellow: Links
Orange: Buttons
Coral: Inputs
Pink: Checkboxes
Peach: Sliders
Salmon: Menus
Amber: Tabs

Implementation: Labels are positioned using CDP DOM.getBoxModel to get element bounding boxes:

const { model } = await session.send('DOM.getBoxModel', {
  backendNodeId: ref.backendNodeId,
})
const box = buildBoxFromQuad(model.border)
// box = { x, y, width, height }
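The `buildBoxFromQuad` helper is not shown here. A CDP quad is a flat array of eight numbers (four x/y corner pairs), so one plausible version takes the axis-aligned bounding box of those corners. This is a sketch under that assumption, not the actual implementation.

```typescript
// A CDP Quad is eight numbers: x1, y1, x2, y2, x3, y3, x4, y4 (corners, clockwise).
type Quad = number[]

interface Box {
  x: number
  y: number
  width: number
  height: number
}

// Hypothetical implementation: axis-aligned bounding box of the four corners.
function buildBoxFromQuad(quad: Quad): Box | null {
  if (quad.length !== 8) return null // malformed quad
  const xs = [quad[0], quad[2], quad[4], quad[6]]
  const ys = [quad[1], quad[3], quad[5], quad[7]]
  const x = Math.min(...xs)
  const y = Math.min(...ys)
  return { x, y, width: Math.max(...xs) - x, height: Math.max(...ys) - y }
}
```

Using min/max rather than the first and third corners keeps the result sensible for rotated or skewed elements.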

Labels are rendered in-page using page.evaluate() with a pre-built client script (a11y-client.js).

Using Snapshot Locators

Rule: Use snapshot locators directly and never invent selectors. The snapshot output is the selector. Copy it verbatim into page.locator():

// Snapshot shows: [data-testid="submit-btn"]
await page.locator('[data-testid="submit-btn"]').click()

// Snapshot shows: role=link[name="SIGN IN"]
await page.locator('role=link[name="SIGN IN"]').click()

// Snapshot shows: role=button[name="Delete"] >> nth=1
await page.locator('role=button[name="Delete"] >> nth=1').click()

Common mistake: Guessing CSS selectors or getByText() when the snapshot already gives you the exact match.

Scoping Snapshots

Pass a locator to snapshot only a subtree:

// Full page snapshot: ~150 lines (sidebar, nav, header, footer, everything)
await snapshot({ page })

// Scoped to main: ~20 lines (just the content you care about)
await snapshot({ locator: page.locator('main') })

// Scope to a dialog
await snapshot({ locator: page.locator('[role="dialog"]') })

When to scope:

Full page snapshot is dominated by navigation/layout you don’t need
You only care about one section (modal, form, sidebar, etc.)
Snapshot is too large and you want to reduce token usage

Filtering Snapshots

Use the search parameter to filter by regex:

// Show only buttons and submit elements
const snap = await snapshot({ page, search: /button|submit/i })

// Find error messages
const errors = await snapshot({ page, search: /error|fail/i })

// Find dialog or modal
const modal = await snapshot({ page, search: /dialog|modal/i })

The search returns the first 10 matching lines with context. For more complex filtering, filter the snapshot string directly in JavaScript:

const snap = await snapshot({ page, showDiffSinceLastCall: false })
const relevant = snap.split('\n')
  .filter(l => l.includes('dialog') || l.includes('error') || l.includes('button'))
  .join('\n')
console.log(relevant)
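If you want "matches with context" behavior in your own filtering code, it can be approximated with a small helper. This sketch is not Playwriter's search implementation: `searchSnapshot` is a hypothetical name, and the amount of context (one line on each side by default) is an assumption, since this guide does not specify it.

```typescript
function searchSnapshot(
  snapshotText: string,
  pattern: RegExp,
  options: { maxMatches?: number; contextLines?: number } = {},
): string {
  const { maxMatches = 10, contextLines = 1 } = options
  const lines = snapshotText.split('\n')
  // Drop the g/y flags so test() does not carry state from one line to the next.
  const matcher = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ''))
  const keep = new Set<number>()
  let matches = 0
  for (let i = 0; i < lines.length && matches < maxMatches; i++) {
    if (!matcher.test(lines[i])) continue
    matches++
    const start = Math.max(0, i - contextLines)
    const end = Math.min(lines.length - 1, i + contextLines)
    for (let j = start; j <= end; j++) keep.add(j)
  }
  // Emit kept lines in their original order.
  return Array.from(keep)
    .sort((a, b) => a - b)
    .map((i) => lines[i])
    .join('\n')
}
```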

Diffing (Incremental Snapshots)

Snapshots track changes since the last call to reduce output:

First call: Returns full snapshot
Subsequent calls: Returns diff (if shorter than full snapshot)
No changes: Returns “No changes since last snapshot”
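The three cases above can be sketched as a small stateful function. This is a deliberately naive illustration: it compares lines as sets (ignoring order and repeated lines) and uses a made-up `-`/`+` prefix format, so the real diff output will look different. `createDiffingSnapshot` is a hypothetical name.

```typescript
function createDiffingSnapshot(): (current: string) => string {
  let previous: string | null = null
  return (current) => {
    const last = previous
    previous = current
    if (last === null) return current // first call: full snapshot
    if (last === current) return 'No changes since last snapshot'

    const lastLines = last.split('\n')
    const currentLines = current.split('\n')
    const before = new Set(lastLines)
    const after = new Set(currentLines)
    const removed = lastLines.filter((line) => !after.has(line)).map((line) => `- ${line}`)
    const added = currentLines.filter((line) => !before.has(line)).map((line) => `+ ${line}`)
    if (removed.length + added.length === 0) return current // e.g. lines only reordered

    const diff = [...removed, ...added].join('\n')
    // Only return the diff when it is actually shorter than the full snapshot.
    return diff.length < current.length ? diff : current
  }
}
```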

Disable diffing:

await snapshot({ page, showDiffSinceLastCall: false })  // Always full output

Diffing with search:

// By default, search returns full matches (diffing disabled)
await snapshot({ page, search: /button/ })

// Enable diffing + search together
await snapshot({ page, search: /button/, showDiffSinceLastCall: true })

iframe Support

Snapshots work with iframes (including out-of-process iframes / OOPIFs):

// Snapshot the main page
const mainSnap = await snapshot({ page })

// Snapshot a specific iframe (locator.contentFrame() returns a FrameLocator synchronously)
const frame = page.locator('iframe').contentFrame()
const frameSnap = await snapshot({ frame })

How it works:

Resolve FrameLocator to actual Frame object
Check if iframe is cross-origin (OOPIF)
If OOPIF, attach CDP session to iframe target
Fetch AX tree and DOM tree from iframe session
Detach CDP session when done

See resolveFrame() in aria-snapshot.ts for implementation.

API Reference

snapshot()

await snapshot({
  page: Page,
  frame?: Frame | FrameLocator,
  locator?: Locator,
  search?: string | RegExp,
  showDiffSinceLastCall?: boolean,  // default: true (false when search provided)
})

Returns: Accessibility snapshot as a string.

screenshotWithAccessibilityLabels()

await screenshotWithAccessibilityLabels({
  page: Page,
  locator?: Locator,
  interactiveOnly?: boolean,  // default: true
  collector: ScreenshotResult[],  // MCP tool injects this
})

Returns: void (screenshot + snapshot are added to collector array)

getAriaSnapshot()

Low-level API that returns snapshot + utilities:

const {
  snapshot,           // String tree
  tree,               // AriaSnapshotNode[] for programmatic use
  refs,               // AriaRef[] with role, name, ref, shortRef
  getSelectorForRef,  // (ref: string) => string | null
  getRefForLocator,   // (locator) => Promise<AriaRef | null>
} = await getAriaSnapshot({ page })

const selector = getSelectorForRef('submit-btn')
if (selector) {
  // getSelectorForRef returns null when the ref is unknown
  await page.locator(selector).click()
}

Performance

Snapshot generation time:

Simple page (10-20 elements): ~50ms
Complex page (100+ elements): ~200ms
With visual labels (CDP box model): +500-1000ms (parallelized with concurrency: 24)
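The "concurrency: 24" figure means box-model lookups run in parallel, but never more than 24 at once. A generic way to get that behavior is a fixed number of worker lanes pulling from a shared index. The sketch below shows the pattern only; `mapWithConcurrency` is a hypothetical helper, and Playwriter may use a library or a different scheme.

```typescript
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0

  // Each lane repeatedly claims the next unprocessed index until none remain.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++
      results[index] = await worker(items[index], index)
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length))
  await Promise.all(Array.from({ length: laneCount }, () => runLane()))
  return results // same order as the input, regardless of completion order
}
```

For visual labels, `items` would be the refs and `worker` the per-element DOM.getBoxModel call, with `limit` set to 24.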

Optimization tips:

Use locator to scope snapshots to subtrees
Use search to filter results server-side
Call snapshot() before screenshot() to avoid double CDP round-trips

Related

Architecture - How CDP is used for browser control
Playwriter Skill Docs - Best practices for using snapshots in AI workflows
