Python SDK

Overview

The Python SDK provides a Pythonic interface to Stagehand’s browser automation capabilities. It connects to the Stagehand API server, which runs the TypeScript core engine and exposes functionality via a RESTful HTTP API.

The Python SDK is a separate implementation rather than a wrapper around the TypeScript package: every call is sent to the Stagehand API server over HTTP. It offers the same core functionality as the TypeScript SDK through Python-native interfaces.

Architecture

The Python SDK uses a client-server architecture:

API Server: Runs the TypeScript Stagehand engine
Python Client: Sends requests to the API server
Browser Session: Managed by Browserbase or local Chrome

┌─────────────────┐
│  Python Client  │
│   (Your Code)   │
└────────┬────────┘
         │ HTTP/REST
         ▼
┌─────────────────┐
│  API Server     │
│  (TypeScript)   │
└────────┬────────┘
         │ CDP
         ▼
┌─────────────────┐
│  Browser        │
│  (Browserbase)  │
└─────────────────┘
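
To make the layering above concrete, here is a minimal sketch of what a thin client could hand to the HTTP layer for a single action. The endpoint path, the `demo-session` identifier, and the JSON field name are hypothetical and chosen only for illustration; the real wire format is defined by the API server and may look different.

```python
import json
from urllib.request import Request


def build_act_request(base_url: str, session_id: str, instruction: str) -> Request:
    """Build (but do not send) a JSON POST request for one action.

    NOTE: the "/sessions/<id>/act" path and the "instruction" field are
    hypothetical placeholders, not the documented Stagehand API.
    """
    body = json.dumps({"instruction": instruction}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/sessions/{session_id}/act",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_act_request(
    "http://localhost:3000", "demo-session", "click the login button"
)
print(request.get_method(), request.full_url)
```

The point of the sketch is that the Python side only serializes intent; the browser work happens in the TypeScript engine behind the server.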

Installation

pip install stagehand

Repository

The Python SDK is maintained in a separate repository:

Python SDK on GitHub

View the Python SDK source code, examples, and documentation

Quick Start

from stagehand import Stagehand
import asyncio

async def main():
    # Initialize Stagehand
    stagehand = Stagehand(
        env="BROWSERBASE",
        api_key="your-browserbase-api-key",
        project_id="your-project-id",
        model={
            "model_name": "openai/gpt-4o",
            "api_key": "your-openai-api-key",
        },
        verbose=1,
    )
    
    await stagehand.init()
    
    try:
        # Navigate to a page
        page = stagehand.page
        await page.goto("https://github.com/browserbase")
        
        # Perform an action
        await stagehand.act("click on the stagehand repo")
        
        # Extract data
        result = await stagehand.extract(
            instruction="extract the repository description and star count",
            schema={
                "description": "string",
                "stars": "number",
            }
        )
        print(result)
        
    finally:
        await stagehand.close()

if __name__ == "__main__":
    asyncio.run(main())
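
The `init()` / `try` / `finally` / `close()` pattern above can be wrapped once and reused. The following is a sketch, not part of the SDK: `managed` is a hypothetical helper, and `FakeSession` is a stand-in that only mimics the `init()` and `close()` methods shown in the Quick Start so the example runs without a browser.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed(session):
    """Await session.init() on entry and session.close() on exit."""
    await session.init()
    try:
        yield session
    finally:
        # Runs even if the body raises, mirroring the try/finally above.
        await session.close()


class FakeSession:
    """Stand-in with the same init()/close() shape, used only for this demo."""

    def __init__(self):
        self.events = []

    async def init(self):
        self.events.append("init")

    async def close(self):
        self.events.append("close")


async def demo():
    session = FakeSession()
    async with managed(session) as s:
        s.events.append("work")
    return session.events


events = asyncio.run(demo())
print(events)
```

With a real client, the same helper would presumably be used as `async with managed(Stagehand(...)) as stagehand:`.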

Core Methods

The Python SDK mirrors the TypeScript API:

act()

Execute actions using natural language:

await stagehand.act("click the login button")

await stagehand.act(
    "fill the email field with {{email}}",
    variables={"email": "user@example.com"}
)
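
The `{{email}}` placeholder is filled from the `variables` dictionary. How the SDK resolves placeholders internally is not described here, so the following is only a sketch of the idea: a hypothetical `render_instruction` helper that performs the same kind of substitution locally, which can be handy for catching a misspelled variable name before making a call.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_instruction(template: str, variables: dict) -> str:
    """Replace each {{name}} with variables[name]; raise KeyError if missing."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)


rendered = render_instruction(
    "fill the email field with {{email}}",
    {"email": "user@example.com"},
)
print(rendered)
```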

extract()

Extract structured data from the page:

result = await stagehand.extract(
    instruction="get user information",
    schema={
        "name": "string",
        "email": "string",
        "age": "number",
    }
)

print(f"Name: {result['name']}")
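
Because the schema is a plain dictionary, it is easy to check a returned value against it before using it. The helper below is a sketch under assumptions: `matches_schema` is a hypothetical function, and it only understands the three type names used on this page (`"string"`, `"number"`, `"boolean"`) plus nested dictionaries and one-element lists.

```python
_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def matches_schema(value, schema) -> bool:
    """Return True if value has the shape described by a dictionary schema."""
    if isinstance(schema, str):
        # bool is a subclass of int, so reject it explicitly for "number".
        if schema == "number" and isinstance(value, bool):
            return False
        return isinstance(value, _TYPES[schema])
    if isinstance(schema, list):
        # A one-element list describes the shape of every item.
        return isinstance(value, list) and all(
            matches_schema(item, schema[0]) for item in value
        )
    if isinstance(schema, dict):
        return isinstance(value, dict) and all(
            key in value and matches_schema(value[key], sub)
            for key, sub in schema.items()
        )
    return False


schema = {"name": "string", "email": "string", "age": "number"}
ok = matches_schema({"name": "Ada", "email": "ada@example.com", "age": 36}, schema)
bad = matches_schema({"name": "Ada", "email": "ada@example.com", "age": "36"}, schema)
print(ok, bad)
```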

observe()

Find elements without taking action:

actions = await stagehand.observe(
    "find all product cards"
)

for action in actions:
    print(f"Element: {action['selector']}")
    print(f"Description: {action['description']}")
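
The loop above treats each result as a dictionary with `selector` and `description` keys. As a small illustration of post-processing that list, the sketch below keeps one entry per selector. The `unique_by_selector` helper and the sample data are made up for the example; whether `observe()` ever returns repeated selectors is not stated here.

```python
def unique_by_selector(actions):
    """Keep the first entry for each selector, preserving order."""
    seen = set()
    unique = []
    for action in actions:
        if action["selector"] not in seen:
            seen.add(action["selector"])
            unique.append(action)
    return unique


# Hypothetical data shaped like the results printed above.
sample = [
    {"selector": "#card-1", "description": "first product card"},
    {"selector": "#card-2", "description": "second product card"},
    {"selector": "#card-1", "description": "first product card (repeat)"},
]

deduped = unique_by_selector(sample)
print([action["selector"] for action in deduped])
```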

agent.execute()

Run multi-step autonomous tasks:

agent = stagehand.agent()

result = await agent.execute(
    "Search for Python tutorials and extract the top 3 results"
)

print(result["message"])
for action in result["actions"]:
    print(f"Action: {action['type']}")
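
For longer runs, a per-type tally is often easier to read than one line per action. This sketch assumes only the result shape used above (a `message` string and an `actions` list whose items have a `type` key); the `summarize_actions` helper and the action type names in the sample are invented for illustration.

```python
from collections import Counter


def summarize_actions(result):
    """Count how many actions of each type an agent run performed."""
    return Counter(action["type"] for action in result["actions"])


# Hypothetical result shaped like the one printed above.
sample_result = {
    "message": "Found 3 tutorials",
    "actions": [{"type": "goto"}, {"type": "click"}, {"type": "click"}],
}

summary = summarize_actions(sample_result)
print(dict(summary))
```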

Configuration

Initialization Options

stagehand = Stagehand(
    env="BROWSERBASE",  # or "LOCAL"
    api_key="your-api-key",
    project_id="your-project-id",
    model={
        "model_name": "openai/gpt-4o",
        "api_key": "your-model-api-key",
    },
    verbose=1,  # 0, 1, or 2
    system_prompt="Custom instructions for the AI",
    self_heal=True,  # Enable auto-recovery
    cache_dir="./cache",  # Cache directory
)

Environment Variables

You can use environment variables for configuration:

export BROWSERBASE_API_KEY="your-api-key"
export BROWSERBASE_PROJECT_ID="your-project-id"
export OPENAI_API_KEY="your-openai-key"

import os

stagehand = Stagehand(
    env="BROWSERBASE",
    api_key=os.getenv("BROWSERBASE_API_KEY"),
    project_id=os.getenv("BROWSERBASE_PROJECT_ID"),
    model={
        "model_name": "openai/gpt-4o",
        "api_key": os.getenv("OPENAI_API_KEY"),
    },
)
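
`os.getenv` returns `None` for an unset variable, so a missing key would otherwise only surface later as an authentication failure. A sketch of a fail-fast loader follows; `load_settings` is a hypothetical helper, and the keyword names it returns are the ones used in the constructor call above.

```python
import os

REQUIRED = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID", "OPENAI_API_KEY")


def load_settings(environ=None):
    """Return constructor keyword arguments, or raise if a variable is unset."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "env": "BROWSERBASE",
        "api_key": environ["BROWSERBASE_API_KEY"],
        "project_id": environ["BROWSERBASE_PROJECT_ID"],
        "model": {
            "model_name": "openai/gpt-4o",
            "api_key": environ["OPENAI_API_KEY"],
        },
    }


# Demo with a fake environment so the example runs anywhere.
settings = load_settings(
    {
        "BROWSERBASE_API_KEY": "bb-key",
        "BROWSERBASE_PROJECT_ID": "proj-1",
        "OPENAI_API_KEY": "sk-demo",
    }
)
print(sorted(settings))
```

In real code the result would be passed on as `Stagehand(**load_settings())`.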

Supported Models

The Python SDK supports the same LLM providers as the TypeScript SDK:

OpenAI: openai/gpt-4o, openai/gpt-4.1-mini, openai/o1
Anthropic: anthropic/claude-3-5-sonnet-latest
Google: google/gemini-2.0-flash
Cerebras: cerebras/llama-3.3-70b
Groq: groq/llama-3.3-70b-versatile
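
All of the identifiers above follow a `provider/model` pattern. A small sketch of splitting and sanity-checking such a string is shown below; `split_model_name` is a hypothetical helper, and it checks only the format, not whether the provider or model actually exists.

```python
def split_model_name(model_name: str):
    """Split "provider/model" into a (provider, model) tuple."""
    provider, separator, model = model_name.partition("/")
    if not separator or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_name!r}")
    return provider, model


provider, model = split_model_name("openai/gpt-4o")
print(provider, model)
```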

Working with the API Server

Running the API Server

The Python SDK requires the Stagehand API server to be running. There are two options:

Option 1: Hosted API

Use Browserbase’s hosted API (recommended):

stagehand = Stagehand(
    env="BROWSERBASE",
    api_key="your-browserbase-api-key",
    project_id="your-project-id",
    # API endpoint is automatically configured
)

Option 2: Self-Hosted

Run the API server locally:

# Clone the repository
git clone https://github.com/browserbase/stagehand
cd stagehand/packages/server

# Install dependencies
pnpm install

# Start the server
pnpm dev

Then configure your Python client:

stagehand = Stagehand(
    env="LOCAL",
    api_url="http://localhost:3000",  # Your local server
)
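
Before constructing the client against a self-hosted server, it can help to confirm that something is listening at `api_url`. The sketch below uses only the standard library and checks TCP reachability, nothing more; `server_is_reachable` is a hypothetical helper and does not verify that the listener is actually a Stagehand server.

```python
import socket
from urllib.parse import urlparse


def server_is_reachable(api_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(api_url)
    if parsed.hostname is None:
        return False
    # Fall back to the scheme's default port when the URL omits one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


# The result depends on whether a server is running locally.
print(server_is_reachable("http://localhost:3000"))
```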

Examples

Web Scraping

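import asyncio
from stagehand import Stagehand

async def scrape_articles():
    stagehand = Stagehand(env="BROWSERBASE", ...)  # "..." stands for the remaining options shown in Quick Start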
    await stagehand.init()
    
    try:
        await stagehand.page.goto("https://news.ycombinator.com")
        
        articles = await stagehand.extract(
            instruction="extract the top 5 articles",
            schema={
                "articles": [
                    {
                        "title": "string",
                        "url": "string",
                        "points": "number",
                    }
                ]
            }
        )
        
        return articles["articles"]
    finally:
        await stagehand.close()

articles = asyncio.run(scrape_articles())
for article in articles:
    print(f"{article['title']} - {article['points']} points")

Form Automation

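import asyncio
from stagehand import Stagehand

async def submit_form():
    stagehand = Stagehand(env="BROWSERBASE", ...)  # "..." stands for the remaining options shown in Quick Start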
    await stagehand.init()
    
    try:
        await stagehand.page.goto("https://example.com/form")
        
        await stagehand.act(
            "fill the name field with {{name}}",
            variables={"name": "John Doe"}
        )
        
        await stagehand.act(
            "fill the email field with {{email}}",
            variables={"email": "john@example.com"}
        )
        
        await stagehand.act("click the submit button")
        
        # Read the confirmation message
        result = await stagehand.extract(
            instruction="get the success message",
            schema={"message": "string"}
        )
        
        print(result["message"])
    finally:
        await stagehand.close()

asyncio.run(submit_form())
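
The form example reads the success message once, right after the click. If the confirmation renders late, one option is to retry the read until it yields something. The sketch below is a generic asyncio polling helper, not an SDK feature; `poll_until` is a hypothetical name, and `fake_check` stands in for a coroutine that would call `extract()` and return the message or `None`.

```python
import asyncio


async def poll_until(check, timeout: float = 10.0, interval: float = 0.5):
    """Await check() repeatedly until it returns a truthy value or time runs out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        value = await check()
        if value:
            return value
        if loop.time() + interval > deadline:
            raise TimeoutError("condition not met before timeout")
        await asyncio.sleep(interval)


async def demo():
    attempts = {"count": 0}

    async def fake_check():
        # Stand-in for an extract() call; succeeds on the third attempt.
        attempts["count"] += 1
        return "Thanks!" if attempts["count"] >= 3 else None

    message = await poll_until(fake_check, timeout=2.0, interval=0.01)
    return message, attempts["count"]


message, attempts = asyncio.run(demo())
print(message, attempts)
```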

Differences from TypeScript SDK

The Python SDK aims for API parity but has some differences:

Schema Definition

TypeScript uses Zod schemas; Python uses plain dictionaries:

# Python
schema = {
    "name": "string",
    "age": "number",
    "active": "boolean",
}

// TypeScript
const schema = z.object({
  name: z.string(),
  age: z.number(),
  active: z.boolean(),
});
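
If you would rather declare the shape once as a Python class, a dictionary schema can be derived from it. This is a sketch, not an SDK feature: `schema_from_dataclass` is a hypothetical helper that handles only flat dataclasses whose fields are `str`, `int`, `float`, or `bool`, mapped to the three type names shown above.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import get_type_hints

_NAMES = {str: "string", int: "number", float: "number", bool: "boolean"}


def schema_from_dataclass(cls) -> dict:
    """Map a flat dataclass to a {"field": "type name"} dictionary."""
    if not is_dataclass(cls):
        raise TypeError(f"{cls!r} is not a dataclass")
    hints = get_type_hints(cls)
    # A field with an unsupported type raises KeyError here.
    return {field.name: _NAMES[hints[field.name]] for field in fields(cls)}


@dataclass
class User:
    name: str
    age: int
    active: bool


schema = schema_from_dataclass(User)
print(schema)
```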

Async/Await

Python uses asyncio for async operations:

import asyncio

async def main():
    # `stagehand` is a Stagehand instance constructed as in Quick Start
    await stagehand.init()
    await stagehand.act("click")

asyncio.run(main())

Naming Conventions

Python uses snake_case instead of camelCase:

# Python
stagehand.act(action, model_name="gpt-4o")

// TypeScript
stagehand.act(action, { modelName: "gpt-4o" })
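
When porting option names between the two SDKs, the mapping is mechanical. The helpers below sketch it for simple identifiers; they are hypothetical utilities rather than SDK functions, and they do not handle acronyms such as `apiURL`.

```python
import re


def to_camel(name: str) -> str:
    """Convert snake_case to camelCase, e.g. model_name -> modelName."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_snake(name: str) -> str:
    """Convert camelCase to snake_case, e.g. modelName -> model_name."""
    # Insert "_" before every capital letter that is not at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


print(to_camel("model_name"), to_snake("systemPrompt"))
```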

Type Hints

The Python SDK includes full type hints:

from stagehand import Stagehand, ActResult, ExtractResult
from typing import Dict, Any

async def my_function() -> Dict[str, Any]:
    stagehand: Stagehand = Stagehand(...)
    result: ActResult = await stagehand.act("click")
    data: ExtractResult = await stagehand.extract(...)
    return data

Resources

Python SDK Repository

Source code and examples

TypeScript SDK

Compare with TypeScript API

API Server

Run your own API server

PyPI Package

View on PyPI

Support

For help with the Python SDK:
