Visual labels

Overview

Playwriter’s visual labels provide a Vimium-style overlay system that assigns short alphanumeric codes to interactive elements on the page. This makes it easy for AI agents (and humans) to reference specific elements without complex selectors.

How visual labels work

When you call screenshotWithAccessibilityLabels(), Playwriter:

Analyzes the page’s accessibility tree
Identifies all interactive elements (buttons, links, inputs, etc.)
Generates short, unique labels (e.g., “e5”, “e12”, “e23”)
Overlays these labels on a screenshot
Returns both the labeled screenshot and a text snapshot with aria-ref selectors

Taking a labeled screenshot

// Navigate to page
state.page = context.pages().find((p) => p.url() === 'about:blank') ?? (await context.newPage())
await state.page.goto('https://example.com', { waitUntil: 'domcontentloaded' })

// Get screenshot with labels
const result = await screenshotWithAccessibilityLabels({ page: state.page })

// Returns:
// - screenshot (Buffer): PNG image with labels overlaid
// - snapshot (string): Text representation with aria-ref selectors

The labels drawn on the screenshot and the aria-ref identifiers in the text snapshot are the same, so a label you see in the image can be used directly as a selector.

Using aria-ref selectors

Once you have labels, use them with the aria-ref selector:

// Take labeled screenshot to see labels
await screenshotWithAccessibilityLabels({ page: state.page })
// Returns: ... button "Sign In" [aria-ref=e5] ...

// Click the element with label e5
await state.page.locator('aria-ref=e5').click()
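
When labels come from an agent's reading of a screenshot, it can help to validate them before building the selector. The following is a minimal sketch, not part of Playwriter: `ariaRefSelector` is a hypothetical helper, and it assumes labels always look like `e` followed by digits, as in the examples on this page.

```typescript
// Hypothetical helper (not a Playwriter API): builds an aria-ref selector
// and rejects strings that do not look like a label.
// Assumption: labels have the form "e" + digits, e.g. e5 or e12.
function ariaRefSelector(label: string): string {
  const trimmed = label.trim()
  if (!/^e\d+$/.test(trimmed)) {
    throw new Error(`Not a valid label: "${label}"`)
  }
  return `aria-ref=${trimmed}`
}

// Usage inside a Playwriter script:
// await state.page.locator(ariaRefSelector('e5')).click()
```

Failing early on a malformed label gives a clearer error than a locator that silently matches nothing.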

Color coding system

Labels are color-coded by element type to provide visual context:

Color	Element Type	Example
Yellow	Links	Navigation links, anchor tags
Orange	Buttons	Submit buttons, action buttons
Coral	Text inputs	Email, password, search fields
Pink	Checkboxes	Selection boxes, toggles
Peach	Sliders	Range inputs, volume controls
Salmon	Menus	Dropdowns, context menus
Amber	Tabs	Tab navigation, tab panels
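
For scripts that post-process snapshot text, the table above can be kept as a small lookup. This is a sketch only: the color names come from the table, but the role keys (`link`, `textbox`, and so on) are assumed ARIA role names, and `labelColorForRole` is a hypothetical helper rather than a Playwriter function.

```typescript
// Color names copied from the table above.
type LabelColor = 'Yellow' | 'Orange' | 'Coral' | 'Pink' | 'Peach' | 'Salmon' | 'Amber'

// Assumption: these keys are the ARIA roles that correspond to each
// element type in the table; Playwriter's actual mapping may differ.
const LABEL_COLORS: { [role: string]: LabelColor } = {
  link: 'Yellow',
  button: 'Orange',
  textbox: 'Coral',
  checkbox: 'Pink',
  slider: 'Peach',
  menu: 'Salmon',
  tab: 'Amber',
}

// Returns the label color for a role, or undefined for roles not in the table.
function labelColorForRole(role: string): LabelColor | undefined {
  const key = role.toLowerCase()
  return Object.prototype.hasOwnProperty.call(LABEL_COLORS, key) ? LABEL_COLORS[key] : undefined
}
```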

Complete workflow example

Navigate to page

state.page = context.pages().find((p) => p.url() === 'about:blank') ?? (await context.newPage())
await state.page.goto('https://github.com/login', { waitUntil: 'domcontentloaded' })

Take labeled screenshot

const result = await screenshotWithAccessibilityLabels({ page: state.page })
// Visual screenshot shows:
// - "Username" input with coral label [e5]
// - "Password" input with coral label [e7]
// - "Sign in" button with orange label [e12]

Use labels to interact

// Fill username (coral label e5)
await state.page.locator('aria-ref=e5').fill('myusername')

// Fill password (coral label e7)
await state.page.locator('aria-ref=e7').fill('mypassword')

// Click sign in (orange label e12)
await state.page.locator('aria-ref=e12').click()

Verify result

await state.page.waitForLoadState('domcontentloaded')
console.log('Current URL:', state.page.url())
await snapshot({ page: state.page })

When to use visual labels

Use visual labels when:

Visual layout matters — need to understand spatial relationships
Complex UIs — many similar elements that are hard to distinguish by text alone
Debugging interactions — want to verify you’re clicking the right element
Visual verification needed — need to confirm button colors, positions, visibility

// Good use case: distinguishing between multiple buttons
await screenshotWithAccessibilityLabels({ page: state.page })
// Shows three "Delete" buttons with different positions and labels
// Can now click the specific one by its label
await state.page.locator('aria-ref=e15').click()

Use text snapshots when:

Reading content — extracting text, checking if elements exist
Fast feedback — no need to analyze images
Token efficiency — text snapshots are 5-20KB vs screenshots at 100KB+
Element finding — searching for specific text or roles

// Good use case: checking if login was successful
await snapshot({ page: state.page, search: /welcome|dashboard/i })
// Fast, cheap, no image tokens needed

Always try snapshot() first before using screenshotWithAccessibilityLabels(). Screenshots are more expensive in both time and tokens.

Programmatic label management

You can also manage labels programmatically without screenshots:

Show labels on page

// Add labels to page (without taking screenshot)
await showAriaRefLabels({ page: state.page })
// Labels now visible in the browser
// Useful for debugging while watching the browser

Hide labels

// Remove labels from page
await hideAriaRefLabels({ page: state.page })

Get snapshot with aria-ref

Get just the text snapshot with aria-ref selectors (no screenshot):

const result = await getAriaSnapshot(state.page)
console.log(result.snapshot) // Text with aria-ref selectors
// Find element by parsing the snapshot text
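
Parsing the snapshot text can be sketched as below. The line format is an assumption taken from the example earlier on this page (`button "Sign In" [aria-ref=e5]`); the real snapshot may carry extra attributes or nesting, and `parseAriaRefs` is a hypothetical helper, not a Playwriter function.

```typescript
interface LabeledElement {
  role: string
  name: string
  ref: string
}

// Assumption: each labeled element appears as: role "name" [aria-ref=eN]
// Lines without an aria-ref (for example plain headings) are skipped.
function parseAriaRefs(snapshot: string): LabeledElement[] {
  const pattern = /([A-Za-z]+)\s+"([^"]*)"\s+\[aria-ref=(e\d+)\]/g
  const elements: LabeledElement[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(snapshot)) !== null) {
    elements.push({ role: match[1], name: match[2], ref: match[3] })
  }
  return elements
}

// Usage inside a Playwriter script:
// const { snapshot } = await getAriaSnapshot(state.page)
// const signIn = parseAriaRefs(snapshot).filter((el) => el.name === 'Sign In')[0]
// if (signIn) await state.page.locator(`aria-ref=${signIn.ref}`).click()
```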

Advanced usage

Scoped screenshots

Take a labeled screenshot of a specific region:

// Take a labeled screenshot of a specific area
const modal = state.page.locator('.modal-dialog')
const result = await screenshotWithAccessibilityLabels({ 
  page: state.page,
  locator: modal 
})
// Shows only labels within the modal

Labels in iframes

Visual labels work in iframes too:

const frame = state.page.frames().find((f) => f.url().includes('widget.example.com'))

if (frame) {
  // Get labeled screenshot of iframe content
  const result = await screenshotWithAccessibilityLabels({ 
    page: state.page,
    frame 
  })
  
  // Use aria-ref within the frame
  await frame.locator('aria-ref=e8').click()
}

Combining with search

Pair a labeled screenshot with a search-filtered text snapshot to focus on specific elements:

// Take the labeled screenshot of the full page
const result = await screenshotWithAccessibilityLabels({ page: state.page })

// Also get text snapshot with search filter
const filtered = await snapshot({ 
  page: state.page, 
  search: /submit|send|post/i 
})
// Shows only elements matching the search pattern
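
If a snapshot string is already in hand, a similar filter can be applied locally without a second call. The sketch below is an approximation under stated assumptions: `filterSnapshotLines` is a hypothetical helper, it matches line by line, and it may not select the same lines as Playwriter's own `search` option.

```typescript
// Hypothetical local filter over snapshot text (not a Playwriter API).
// Keeps every line of the snapshot that matches the search pattern.
function filterSnapshotLines(snapshot: string, search: RegExp): string[] {
  // Rebuild the pattern without the global flag so test() keeps no
  // lastIndex state between lines.
  const pattern = new RegExp(search.source, search.ignoreCase ? 'i' : '')
  return snapshot.split('\n').filter((line) => pattern.test(line))
}

// Usage inside a Playwriter script:
// const result = await screenshotWithAccessibilityLabels({ page: state.page })
// const submitLines = filterSnapshotLines(result.snapshot, /submit|send|post/i)
```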

Label lifecycle

Labels are stable within a single page state but change when:

Page navigation — labels reset on new page load
DOM changes — dynamic content may get new labels
Re-taking screenshot — labels may be reassigned

Always take a fresh labeled screenshot before clicking if the page has changed. Stale labels may point to wrong elements.

// ❌ Bad: using old labels after page change
await screenshotWithAccessibilityLabels({ page: state.page })
// ... page content changes ...
await state.page.locator('aria-ref=e5').click() // May click wrong element!

// ✅ Good: fresh screenshot after changes
await screenshotWithAccessibilityLabels({ page: state.page })
await state.page.locator('aria-ref=e5').click()
// ... page changes ...
await screenshotWithAccessibilityLabels({ page: state.page }) // Fresh labels
await state.page.locator('aria-ref=e12').click() // Correct element
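
One way to make stale labels fail loudly is to remember which labels came from the latest snapshot and on which URL. The class below is a hypothetical bookkeeping sketch, not a Playwriter API. It only detects navigation and unknown labels; DOM changes on the same URL still require retaking the screenshot.

```typescript
// Hypothetical helper (not a Playwriter API): tracks the labels seen in the
// most recent snapshot and refuses to build selectors for anything else.
class LabelSession {
  private url = ''
  private refs: { [ref: string]: true } = {}

  // Call right after each labeled screenshot or snapshot.
  record(url: string, snapshot: string): void {
    this.url = url
    this.refs = {}
    const pattern = /\[aria-ref=(e\d+)\]/g
    let match: RegExpExecArray | null
    while ((match = pattern.exec(snapshot)) !== null) {
      this.refs[match[1]] = true
    }
  }

  // Returns a selector only if the label was recorded on the current URL.
  selectorFor(ref: string, currentUrl: string): string {
    if (currentUrl !== this.url) {
      throw new Error(`Label ${ref} was recorded on ${this.url}; take a fresh screenshot`)
    }
    if (this.refs[ref] !== true) {
      throw new Error(`Label ${ref} is not in the latest snapshot`)
    }
    return `aria-ref=${ref}`
  }
}

// Usage inside a Playwriter script:
// const labels = new LabelSession()
// const result = await screenshotWithAccessibilityLabels({ page: state.page })
// labels.record(state.page.url(), result.snapshot)
// await state.page.locator(labels.selectorFor('e5', state.page.url())).click()
```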

Common use cases

Finding specific button among many

// Page has multiple "Delete" buttons
await screenshotWithAccessibilityLabels({ page: state.page })
// Screenshot shows:
// - "Delete" [e5] - next to item 1
// - "Delete" [e8] - next to item 2  
// - "Delete" [e11] - next to item 3

// Click the specific one
await state.page.locator('aria-ref=e8').click()

Verifying element visibility

// Check if sidebar toggle is visible
await screenshotWithAccessibilityLabels({ page: state.page })
// Look at screenshot to verify button is visible and accessible
// If label appears, element is visible and interactive

Complex form navigation

// Large form with many inputs
await screenshotWithAccessibilityLabels({ page: state.page })
// Screenshot shows all fields with labels
// Fill form using labels:
await state.page.locator('aria-ref=e5').fill('John')
await state.page.locator('aria-ref=e7').fill('Doe')
await state.page.locator('aria-ref=e9').fill('john@example.com')
await state.page.locator('aria-ref=e15').click() // Submit

Best practices

Try text snapshot first — cheaper and faster than screenshots
Use for visual verification — when layout and positioning matter
Retake after page changes — don’t reuse stale labels
Color coding helps — use colors to identify element types quickly
Combine with text snapshots — use screenshot for layout, text for content
Clean up afterward — hide labels with hideAriaRefLabels() if shown programmatically
Handle label not found — elements may appear/disappear dynamically
