Overview

OpenSteer is designed for AI agents to create browser automation scripts that are replayable and maintainable. This guide covers the workflow, best practices, and integration patterns for AI-driven automation.

Why OpenSteer for AI Agents?

Deterministic Replay

Selector caching ensures scripts work consistently across runs

Description-Based Actions

Natural language element targeting instead of brittle CSS selectors

Automatic Waiting

No manual waits - OpenSteer handles element readiness

Structured Extraction

AI-powered data extraction with typed schemas

For AI agents to create maintainable automation scripts, follow these steps:
1

Use OpenSteer APIs instead of raw Playwright

Always prefer OpenSteer methods over direct Playwright calls:
await opensteer.click({ description: 'submit button' })
await opensteer.input({ description: 'email field', text: 'user@example.com' })
await opensteer.extract({ description: 'product listing', schema })
Why: OpenSteer methods cache selectors for replay and handle waiting automatically.
2

Keep namespace consistent

The SDK name parameter must match the CLI --name flag:
# CLI exploration
opensteer open https://example.com --name product-scraper
// SDK script
const opensteer = new Opensteer({ name: 'product-scraper' })
Why: This ensures the selector cache is shared between CLI exploration and script execution.
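One way to keep the two names from drifting apart is to define the namespace once and derive both the CLI command and the SDK options from it. The sketch below is illustrative only: `NAMESPACE`, `cliOpenCommand`, and `sdkOptions` are hypothetical helpers, not part of OpenSteer.

```typescript
// Hypothetical helpers, not part of the OpenSteer SDK.
// Single source of truth for the namespace shared by the CLI and the SDK.
const NAMESPACE = 'product-scraper'

// Builds the exploration command shown above for a given URL.
function cliOpenCommand(url: string, name: string = NAMESPACE): string {
  return `opensteer open ${url} --name ${name}`
}

// Builds the options object that would be passed to `new Opensteer(...)`.
function sdkOptions(name: string = NAMESPACE): { name: string } {
  return { name }
}

console.log(cliOpenCommand('https://example.com'))
console.log(sdkOptions())
```

In a real script the result of `sdkOptions()` would be passed to the `Opensteer` constructor, so that renaming the namespace only requires changing one constant.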
3

Take snapshots before actions and extraction

Always snapshot before interacting with elements:
// Before actions
await opensteer.snapshot({ mode: 'action' })
await opensteer.click({ description: 'login button' })

// Before extraction
await opensteer.snapshot({ mode: 'extraction' })
const data = await opensteer.extract({ description: 'products', schema })
Why: Snapshots provide element counters for targeting and help the AI agent identify elements.
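If an agent tends to forget the snapshot step, the rule can be folded into a small wrapper. This is a minimal sketch: `SnapshotClicker` and `snapshotThenClick` are hypothetical names, and the interface only assumes the `snapshot` and `click` call shapes shown in this guide.

```typescript
// Minimal structural type covering only the calls used in this guide.
type SnapshotMode = 'action' | 'extraction'

interface SnapshotClicker {
  snapshot(options: { mode: SnapshotMode }): Promise<unknown>
  click(options: { description: string }): Promise<unknown>
}

// Hypothetical helper: always snapshots in 'action' mode before clicking.
async function snapshotThenClick(
  steer: SnapshotClicker,
  description: string,
): Promise<void> {
  await steer.snapshot({ mode: 'action' })
  await steer.click({ description })
}
```

Assuming an `Opensteer` instance satisfies this interface, usage inside `run()` would look like `await snapshotThenClick(opensteer, 'login button')`.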
4

Prefer description targeting

Use descriptions for all actions you want to replay:
// Good - replayable
await opensteer.click({ description: 'search button' })

// Acceptable for exploration
await opensteer.click({ element: 5, description: 'search button' })

// Avoid - not replayable
await opensteer.click({ element: 5 })
Why: Descriptions enable selector caching for deterministic replay.
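The rule above can be expressed as a simple check that an agent (or a review script) runs over the targets it is about to emit. This is a sketch under the assumption that "replayable" means "has a non-empty description", as stated in this guide; `Target` and `isReplayable` are hypothetical names.

```typescript
// Shape of the targeting options used in this guide.
interface Target {
  description?: string
  element?: number
  selector?: string
}

// Hypothetical check: a target counts as replayable only if it
// carries a non-empty description.
function isReplayable(target: Target): boolean {
  return typeof target.description === 'string' && target.description.trim().length > 0
}

const targets: Target[] = [
  { description: 'search button' },
  { element: 5, description: 'search button' },
  { element: 5 },
]
for (const target of targets) {
  console.log(JSON.stringify(target), isReplayable(target))
}
```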
5

Always wrap in try/finally

Ensure resources are cleaned up:
const opensteer = new Opensteer({ name: 'my-scraper' })

try {
  await opensteer.launch()
  // ... automation steps
} finally {
  await opensteer.close()
}
Why: Prevents browser processes from leaking.
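The same cleanup guarantee can be packaged as a reusable wrapper so that individual scripts cannot forget the `finally` block. This is a sketch, not an OpenSteer API: `Session` and `withSession` are hypothetical names, and the interface only assumes the `launch()` and `close()` calls shown above.

```typescript
// Minimal structural type: anything with launch() and close().
interface Session {
  launch(): Promise<unknown>
  close(): Promise<unknown>
}

// Hypothetical wrapper: launches the session, runs `body`, and always
// closes the session afterwards, even if `body` throws.
async function withSession<S extends Session, T>(
  session: S,
  body: (session: S) => Promise<T>,
): Promise<T> {
  try {
    await session.launch()
    return await body(session)
  } finally {
    await session.close()
  }
}
```

Any error thrown by `body` still propagates to the caller; the wrapper only ensures `close()` runs first.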

Critical Rules for AI Agents

NEVER use raw Playwright when OpenSteer methods exist. OpenSteer provides:
  • opensteer.click() instead of page.click()
  • opensteer.input() instead of page.fill()
  • opensteer.extract() instead of page.evaluate() + querySelectorAll
  • opensteer.snapshot() instead of page.content()
Raw Playwright bypasses caching and makes scripts brittle.
The only valid use of page.evaluate() is calling fetch() for API-based extraction when a site has internal REST/GraphQL endpoints.

Element Targeting Preferences

Use counters from snapshots during exploration:
await opensteer.snapshot({ mode: 'action' })
await opensteer.click({ element: 5, description: 'submit button' })
  • Fast during development
  • Combine with description for caching
  • Not replayable without description
Use explicit selectors only when necessary:
await opensteer.click({ selector: '#unique-id' })
  • Brittle - breaks on DOM changes
  • No caching benefits
  • Use only for stable, unique IDs

Script Template

AI agents should generate scripts following this pattern:
import { Opensteer } from 'opensteer'

async function run() {
  const opensteer = new Opensteer({
    name: 'script-name',  // Match CLI --name
    storage: { rootDir: process.cwd() },
  })

  await opensteer.launch({ headless: false })

  try {
    await opensteer.goto('https://example.com')

    // Take snapshot before actions
    await opensteer.snapshot({ mode: 'action' })

    // Perform actions with descriptions
    await opensteer.input({
      text: 'search query',
      description: 'search input',
    })

    await opensteer.click({ description: 'search button' })

    // Take snapshot before extraction
    await opensteer.snapshot({ mode: 'extraction' })

    // Extract with description and schema
    const data = await opensteer.extract({
      description: 'search results',
      schema: {
        results: [{ title: '', url: '' }]
      },
    })

    console.log(JSON.stringify(data, null, 2))
  } finally {
    await opensteer.close()
  }
}

run().catch((err) => {
  console.error(err)
  process.exit(1)
})
Do not use top-level await. Always wrap the script in async function run() and call run().catch(...).

Common Patterns

Form Filling

// Fill multiple fields
await opensteer.input({ description: 'first name', text: 'Ada' })
await opensteer.input({ description: 'last name', text: 'Lovelace' })
await opensteer.input({ description: 'email', text: 'ada@example.com' })

// Select dropdown
await opensteer.select({ description: 'country', label: 'United States' })

// Submit
await opensteer.click({ description: 'submit button' })

Multi-Page Navigation

// List page
await opensteer.goto('https://example.com/products')
await opensteer.snapshot({ mode: 'extraction' })
const products = await opensteer.extract({
  description: 'product listing',
  schema: { items: [{ title: '', url: '' }] }
})

// Detail page
await opensteer.click({ description: 'first product link' })
await opensteer.snapshot({ mode: 'extraction' })
const details = await opensteer.extract({
  description: 'product details',
  schema: { description: '', price: '', specs: [''] }
})
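The schemas above are written as example-shaped objects: an empty string appears to stand in for a text field and a one-element array for a list. Assuming that convention holds (this guide does not spell it out), TypeScript can derive a result type from the template with `typeof`, and a small checker can confirm that extracted data has the expected shape before it is used. `productSchema`, `ProductListing`, and `matchesTemplate` are hypothetical names for illustration.

```typescript
// Template in the same style as the schemas above.
const productSchema = { items: [{ title: '', url: '' }] }

// Result type derived from the template. Assumes extract() returns
// data shaped like the schema it was given.
type ProductListing = typeof productSchema

// Hypothetical helper: checks that `value` has the same shape as `template`.
function matchesTemplate(template: unknown, value: unknown): boolean {
  if (Array.isArray(template)) {
    // A one-element array template describes every item of the list.
    return Array.isArray(value) && value.every((item) => matchesTemplate(template[0], item))
  }
  if (typeof template === 'object' && template !== null) {
    if (typeof value !== 'object' || value === null || Array.isArray(value)) return false
    const expected = template as Record<string, unknown>
    const actual = value as Record<string, unknown>
    return Object.keys(expected).every((key) => matchesTemplate(expected[key], actual[key]))
  }
  // Primitive placeholders ('' here) only constrain the primitive type.
  return typeof value === typeof template
}

const sample: ProductListing = {
  items: [{ title: 'Widget', url: 'https://example.com/widget' }],
}
console.log(matchesTemplate(productSchema, sample))
```

A check like this is most useful right after `extract()`, so a script fails early when a page change causes fields to go missing.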

Handling Dynamic Content

// Wait for SPA content
await opensteer.waitForText('Results loaded')

// Then snapshot and extract
await opensteer.snapshot({ mode: 'extraction' })
const data = await opensteer.extract({ description: 'results', schema })

Tab Management

// Open in new tab
await opensteer.newTab()
await opensteer.goto('https://example.com')

// Switch between tabs
await opensteer.switchTab(0)

// Close tab
await opensteer.closeTab()

Skills Installation

OpenSteer provides skills for AI coding assistants.

CLI Installation

Install the OpenSteer skill pack globally:
npm i -g opensteer
opensteer skills install
This installs skills to ~/.config/opencode/skills/ for AI assistants to use.

Claude Code Marketplace Plugin

For Claude Code, install via the marketplace:
/plugin marketplace add steerlabs/opensteer
/plugin install opensteer@opensteer-marketplace
This gives Claude Code access to OpenSteer documentation and workflows.

Available Skills

opensteer

Browser automation, web scraping, and structured data extraction

electron

Electron app automation and testing
See the Skills guide for detailed usage.

Debugging Agent-Generated Scripts

When a script fails:
1

Check namespace consistency

Verify name matches between CLI and SDK:
# CLI
opensteer open https://example.com --name my-scraper
// SDK
const opensteer = new Opensteer({ name: 'my-scraper' })
2

Verify selectors are cached

Check .opensteer/selectors/<namespace>/ directory for cached selectors.
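A quick way to perform this check from Node is to list the directory contents. The sketch below assumes the cache lives at `.opensteer/selectors/<namespace>/` relative to the storage root directory (here the current working directory); the helper names `selectorCacheDir` and `listCachedSelectors` are hypothetical, and the format of the files inside is not assumed.

```typescript
import { existsSync, readdirSync } from 'node:fs'
import { join } from 'node:path'

// Assumption: the cache lives under <rootDir>/.opensteer/selectors/<namespace>/.
function selectorCacheDir(rootDir: string, namespace: string): string {
  return join(rootDir, '.opensteer', 'selectors', namespace)
}

// Returns the entry names in the cache directory, or [] if it does not exist.
function listCachedSelectors(rootDir: string, namespace: string): string[] {
  const dir = selectorCacheDir(rootDir, namespace)
  return existsSync(dir) ? readdirSync(dir) : []
}

const entries = listCachedSelectors(process.cwd(), 'my-scraper')
console.log(
  entries.length > 0
    ? entries.join('\n')
    : 'No cached selectors found for this namespace',
)
```

An empty result usually points back to the first step: the namespace used during exploration and the one used by the script may differ.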
3

Add timing waits for SPAs

await opensteer.waitForText('Content loaded')
await opensteer.snapshot({ mode: 'extraction' })
4

Remove obstacles

Handle cookie banners, modals, or login walls:
await opensteer.click({ description: 'close cookie banner' })
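Obstacles such as cookie banners do not appear on every run, so an unconditional click can fail on a clean page. The sketch below makes the dismissal optional. It assumes that `click()` rejects when no matching element is found, which this guide does not confirm; `Clicker` and `dismissIfPresent` are hypothetical names.

```typescript
// Minimal structural type covering only the click call used here.
interface Clicker {
  click(options: { description: string }): Promise<unknown>
}

// Hypothetical helper. Assumes click() rejects when the element is missing.
// Returns true if the obstacle was clicked, false otherwise.
async function dismissIfPresent(
  steer: Clicker,
  description: string,
): Promise<boolean> {
  try {
    await steer.click({ description })
    return true
  } catch {
    // Obstacle not present (or not clickable): continue with the script.
    return false
  }
}
```

Because this swallows every error from `click()`, it is best reserved for optional obstacles and not used for steps the script depends on.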

Best Practices Summary

// Good
await opensteer.click({ description: 'button' })
await opensteer.extract({ description: 'data', schema })
// Bad
await page.click('.button')
await page.evaluate(() => document.querySelector('.data').textContent)

// Good - replayable
await opensteer.click({ description: 'submit button' })
// Bad - not replayable
await opensteer.click({ element: 5 })

// Good
await opensteer.snapshot({ mode: 'action' })
await opensteer.click({ description: 'button' })
// Bad - actions handle waiting automatically
await page.waitForTimeout(1000)
await opensteer.click({ description: 'button' })

Next Steps

Skills Guide

Install and use OpenSteer skills with AI assistants

CUA Agent

Use Computer Use Agents for natural language automation

Browser Automation

Learn core automation features and patterns

Data Extraction

Extract structured data with typed schemas
