Method Signature

browser.agent(config: OpensteerAgentConfig): OpensteerAgentInstance
Create an agent instance for autonomous browser control using AI models with computer use capabilities.

Configuration

mode
OpensteerAgentMode
required
Agent mode. Currently only 'cua' (Computer Use Agent) is supported.
model
string | OpensteerAgentModelConfig
Model configuration. Can be a model string in provider/model format or a detailed configuration object.
String format: 'provider/model'
Examples:
  • 'openai/computer-use-preview'
  • 'anthropic/claude-3-5-sonnet-20241022'
  • 'google/gemini-2.0-flash-exp'
Object format: See Model Configuration below.
systemPrompt
string
Custom system prompt to guide agent behavior. Defaults to:
You are a browser automation agent. Complete the user instruction 
safely and efficiently. Do not ask follow-up questions. Finish 
as soon as the task is complete.
waitBetweenActionsMs
number
default:500
Milliseconds to wait between agent actions. Must be a non-negative number. Set to 0 for no delay.
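
The constraints above (a model string with a slash separator, and a non-negative waitBetweenActionsMs) can be checked before calling browser.agent(). The following is a minimal sketch, not part of the Opensteer API: parseModelString and checkWaitBetweenActionsMs are hypothetical helper names, and splitting on the first slash is an assumption about how the provider is separated from the model name.

```typescript
// Hypothetical helpers (not exported by Opensteer) that mirror the
// documented constraints on `model` and `waitBetweenActionsMs`.
interface ParsedModel {
  provider: string
  model: string
}

function parseModelString(value: string): ParsedModel {
  // Assumption: the provider is everything before the FIRST slash.
  const slash = value.indexOf('/')
  // Require a non-empty provider before the slash and a non-empty model after it.
  if (slash <= 0 || slash === value.length - 1) {
    throw new Error(`Expected 'provider/model', got '${value}'`)
  }
  return { provider: value.slice(0, slash), model: value.slice(slash + 1) }
}

function checkWaitBetweenActionsMs(ms: number): number {
  // Rejects negative values, NaN, and Infinity; 0 is allowed (no delay).
  if (!Number.isFinite(ms) || ms < 0) {
    throw new Error(`waitBetweenActionsMs must be a non-negative number, got ${ms}`)
  }
  return ms
}
```

For instance, parseModelString('anthropic/claude-3-5-sonnet-20241022') yields the provider 'anthropic' and the model 'claude-3-5-sonnet-20241022', while a string without a slash throws.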

Model Configuration

When using the object format for model, provide an OpensteerAgentModelConfig:
modelName
string
required
Model name in provider/model format. Must include a slash separator.
Examples:
  • 'openai/computer-use-preview'
  • 'anthropic/claude-3-5-sonnet-20241022'
  • 'google/gemini-2.0-flash-exp'
apiKey
string
API key for the provider. If not provided, reads from environment variables:
  • OpenAI: OPENAI_API_KEY
  • Anthropic: ANTHROPIC_API_KEY
  • Google: GOOGLE_GENERATIVE_AI_API_KEY, GEMINI_API_KEY, or GOOGLE_API_KEY
baseUrl
string
Custom base URL for the provider API. Useful for proxies or compatible services.
organization
string
Organization ID for OpenAI (if applicable).
thinkingBudget
number
Thinking budget for OpenAI extended reasoning models. Controls the amount of reasoning tokens allocated.
environment
string
Environment identifier for Google models (provider-specific configuration).
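
The API key fallback described for apiKey can be pictured as a lookup table from provider to environment variable names. This is only an illustrative sketch: resolveApiKey and API_KEY_ENV_VARS are hypothetical names, and two behaviors are assumptions not confirmed by this page: that an explicit apiKey takes precedence over the environment, and that the Google variables are tried in the order they are listed.

```typescript
// Hypothetical illustration of the documented fallback; not Opensteer's code.
type Provider = 'openai' | 'anthropic' | 'google'

// Environment variable names as listed in the apiKey description.
const API_KEY_ENV_VARS: Record<Provider, string[]> = {
  openai: ['OPENAI_API_KEY'],
  anthropic: ['ANTHROPIC_API_KEY'],
  google: ['GOOGLE_GENERATIVE_AI_API_KEY', 'GEMINI_API_KEY', 'GOOGLE_API_KEY'],
}

function resolveApiKey(
  provider: Provider,
  env: Record<string, string | undefined>,
  explicitKey?: string,
): string | undefined {
  // Assumption: a key passed in the config wins over the environment.
  if (explicitKey) return explicitKey
  // Assumption: variables are checked in listing order; first non-empty wins.
  for (const name of API_KEY_ENV_VARS[provider]) {
    const value = env[name]
    if (value) return value
  }
  return undefined
}
```

In a Node.js script, passing process.env as the env argument gives a quick way to confirm that a key is available before creating the agent.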

Return Value

Returns an OpensteerAgentInstance with:
execute
function
required
Execute method for running agent tasks. See execute() for details.
execute(instructionOrOptions: string | OpensteerAgentExecuteOptions): Promise<OpensteerAgentResult>

Examples

Simple String Model

import { Opensteer } from 'opensteer'

const browser = new Opensteer()
await browser.launch()

const agent = browser.agent({
  mode: 'cua',
  model: 'openai/computer-use-preview'
})

const result = await agent.execute('Navigate to the pricing page')
await browser.close()

OpenAI with Custom Configuration

const agent = browser.agent({
  mode: 'cua',
  model: {
    modelName: 'openai/computer-use-preview',
    apiKey: process.env.OPENAI_API_KEY,
    organization: 'org-123456',
    thinkingBudget: 10000
  },
  systemPrompt: 'You are a precise automation agent. Always verify actions before executing.',
  waitBetweenActionsMs: 1000
})

Anthropic Configuration

const agent = browser.agent({
  mode: 'cua',
  model: {
    modelName: 'anthropic/claude-3-5-sonnet-20241022',
    apiKey: process.env.ANTHROPIC_API_KEY
  },
  systemPrompt: 'Complete tasks efficiently with minimal steps.',
  waitBetweenActionsMs: 500
})

const result = await agent.execute('Search for documentation')
console.log('Provider:', result.provider) // 'anthropic'
console.log('Model:', result.model) // 'anthropic/claude-3-5-sonnet-20241022'

Google Gemini Configuration

const agent = browser.agent({
  mode: 'cua',
  model: {
    modelName: 'google/gemini-2.0-flash-exp',
    apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY,
    environment: 'production'
  },
  waitBetweenActionsMs: 300
})

Custom Base URL (Proxy or Compatible Service)

const agent = browser.agent({
  mode: 'cua',
  model: {
    modelName: 'openai/computer-use-preview',
    apiKey: process.env.CUSTOM_API_KEY,
    baseUrl: 'https://custom-proxy.example.com/v1'
  }
})

Using Environment Variables Only

// Set environment variables:
// export OPENAI_API_KEY=sk-...

const agent = browser.agent({
  mode: 'cua',
  model: 'openai/computer-use-preview'
  // API key read automatically from OPENAI_API_KEY
})

No Wait Between Actions

const agent = browser.agent({
  mode: 'cua',
  model: 'anthropic/claude-3-5-sonnet-20241022',
  waitBetweenActionsMs: 0 // Execute actions as fast as possible
})

Provider-Specific Notes

OpenAI

  • Requires OPENAI_API_KEY environment variable or apiKey in config
  • Supports organization for team/org accounts
  • Extended reasoning models support thinkingBudget parameter
  • Model example: 'openai/computer-use-preview'

Anthropic

  • Requires ANTHROPIC_API_KEY environment variable or apiKey in config
  • Claude models with computer use capabilities
  • Model example: 'anthropic/claude-3-5-sonnet-20241022'

Google

  • Requires GOOGLE_GENERATIVE_AI_API_KEY, GEMINI_API_KEY, or GOOGLE_API_KEY
  • Supports environment parameter for environment-specific configuration
  • Gemini models with multimodal capabilities
  • Model example: 'google/gemini-2.0-flash-exp'

Error Handling

An invalid configuration throws one of two error types, which can be distinguished by name:
try {
  const agent = browser.agent({
    mode: 'cua',
    model: 'invalid-provider/model'
  })
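} catch (error) {
  // Narrow first: under strict TypeScript a caught value is typed `unknown`
  const name = error instanceof Error ? error.name : ''
  if (name === 'OpensteerAgentProviderError') {
    console.error('Unsupported provider')
  } else if (name === 'OpensteerAgentConfigError') {
    console.error('Configuration error:', (error as Error).message)
  } else {
    throw error // Not a configuration problem; let it propagate
  }
}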
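
When the same handling is needed in several places, the name check can be moved into a small function. The sketch below assumes only what this section states, namely that the errors carry the names OpensteerAgentProviderError and OpensteerAgentConfigError; describeAgentError is a hypothetical helper, not an Opensteer export.

```typescript
// Hypothetical helper: turns a caught value into a readable message
// based on the error names documented in this section.
function describeAgentError(error: unknown): string {
  // Anything can be thrown in JavaScript, so handle non-Error values too.
  if (!(error instanceof Error)) {
    return `Unknown failure: ${String(error)}`
  }
  switch (error.name) {
    case 'OpensteerAgentProviderError':
      return `Unsupported provider: ${error.message}`
    case 'OpensteerAgentConfigError':
      return `Configuration error: ${error.message}`
    default:
      return `Unexpected error: ${error.message}`
  }
}
```

Matching on error.name rather than instanceof keeps the helper free of any import, at the cost of relying on the name strings staying stable.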

Notes

  • Only one agent execution can run at a time per agent instance
  • Agent instances are tied to the browser instance they’re created from
  • The model must always be given in provider/model format, with a forward slash as the separator
  • An API key is required, supplied either in the config or through an environment variable
  • The system prompt significantly affects agent behavior and decision-making
  • Higher waitBetweenActionsMs values can improve reliability but slow execution
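
Because only one execution can run at a time per agent instance, callers that may issue overlapping tasks can queue them. This page does not say what happens when a second execute() call starts while another is running, so the sketch below sidesteps the question by never overlapping calls. serialize is a hypothetical, generic helper and does not depend on Opensteer.

```typescript
// Hypothetical helper: wraps an async function so that calls run strictly
// one after another, in the order they were made.
function serialize<A, R>(run: (arg: A) => Promise<R>): (arg: A) => Promise<R> {
  // `tail` settles when the most recently queued call has finished.
  let tail: Promise<unknown> = Promise.resolve()
  return (arg: A): Promise<R> => {
    // Start this call only after the previous one has settled.
    const next = tail.then(() => run(arg))
    // Swallow failures on the queue itself so one rejected call does not
    // block later ones; the caller still sees the rejection via `next`.
    tail = next.catch(() => undefined)
    return next
  }
}
```

Wrapping the agent as serialize((instruction: string) => agent.execute(instruction)) would then let several parts of a program submit instructions without coordinating with each other.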
