Overview
The Python SDK provides a Pythonic interface to Stagehand's browser automation capabilities. It is a separate implementation from the TypeScript SDK: instead of embedding the engine, it connects to the Stagehand API server, which runs the TypeScript core engine and exposes it over a RESTful HTTP API.
The SDK offers the same core functionality as the TypeScript SDK through Python-native interfaces.
Architecture
The Python SDK uses a client-server architecture:
API Server: Runs the TypeScript Stagehand engine
Python Client: Sends requests to the API server
Browser Session: Managed by Browserbase or local Chrome
┌─────────────────┐
│  Python Client  │
│   (Your Code)   │
└────────┬────────┘
         │ HTTP/REST
         ▼
┌─────────────────┐
│   API Server    │
│  (TypeScript)   │
└────────┬────────┘
         │ CDP
         ▼
┌─────────────────┐
│     Browser     │
│  (Browserbase)  │
└─────────────────┘
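The first hop in the diagram can be pictured as the client serializing each call into a JSON HTTP request. The sketch below only builds such a request with the standard library and never sends it. The `/sessions/{id}/act` path, the `instruction` payload field, and the `build_act_request` helper are hypothetical illustrations, not the Stagehand API server's actual routes.

```python
import json
from urllib.request import Request


def build_act_request(base_url: str, session_id: str, instruction: str) -> Request:
    """Build (but do not send) an HTTP request for a hypothetical 'act' endpoint."""
    # Hypothetical URL layout; the real API server may use different routes.
    url = f"{base_url.rstrip('/')}/sessions/{session_id}/act"
    body = json.dumps({"instruction": instruction}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_act_request("http://localhost:3000", "demo-session", "click the login button")
print(request.get_method(), request.full_url)
```

The point of the sketch is the division of labor: the Python side only describes what should happen, while the server decides how to drive the browser.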
Installation
Repository
The Python SDK is maintained in a separate repository:
Python SDK on GitHub: View the Python SDK source code, examples, and documentation
Quick Start
import asyncio

from stagehand import Stagehand


async def main():
    # Initialize Stagehand
    stagehand = Stagehand(
        env="BROWSERBASE",
        api_key="your-browserbase-api-key",
        project_id="your-project-id",
        model={
            "model_name": "openai/gpt-4o",
            "api_key": "your-openai-api-key",
        },
        verbose=1,
    )
    await stagehand.init()
    try:
        # Navigate to a page
        page = stagehand.page
        await page.goto("https://github.com/browserbase")
        # Perform an action
        await stagehand.act("click on the stagehand repo")
        # Extract data
        result = await stagehand.extract(
            instruction="extract the repository description",
            schema={
                "description": "string",
                "stars": "number",
            },
        )
        print(result)
    finally:
        # Always release the browser session, even if a step fails
        await stagehand.close()


if __name__ == "__main__":
    asyncio.run(main())
Core Methods
The Python SDK mirrors the TypeScript API:
act()
Execute actions using natural language:
await stagehand.act("click the login button")
await stagehand.act(
    "fill the email field with {{ email }}",
    variables={"email": "user@example.com"},
)
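One way to picture the `variables` argument is as placeholder substitution applied to the instruction text. The sketch below is an illustration only and not the SDK's implementation: `render_instruction` is a hypothetical helper, and whether substitution happens in the client or on the server is not stated in this page.

```python
import re

# Matches placeholders such as {{ email }} or {{email}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_instruction(template: str, variables: dict[str, str]) -> str:
    """Replace each {{ name }} placeholder with its value from `variables`."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for placeholder '{name}'")
        return variables[name]

    return _PLACEHOLDER.sub(substitute, template)


print(render_instruction(
    "fill the email field with {{ email }}",
    {"email": "user@example.com"},
))
```

Raising on a missing placeholder, rather than leaving `{{ name }}` in the text, surfaces typos in variable names before an instruction is sent.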
extract()
Extract structured data from the page:
result = await stagehand.extract(
    instruction="get user information",
    schema={
        "name": "string",
        "email": "string",
        "age": "number",
    },
)
print(f"Name: {result['name']}")
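Because the schema is a plain dictionary, it can be useful to check a returned result against it before relying on individual keys. The following is a minimal sketch under the assumption that schemas use only the type names shown on this page (`"string"`, `"number"`, `"boolean"`), nested dictionaries, and one-element lists; `matches_schema` is a hypothetical helper, not part of the SDK.

```python
from numbers import Number

# Maps the type names used in the dictionary schemas to Python checks.
_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int, so exclude it from "number" explicitly.
    "number": lambda v: isinstance(v, Number) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
}


def matches_schema(value, schema) -> bool:
    """Return True if `value` has the shape described by `schema`."""
    if isinstance(schema, str):
        check = _TYPE_CHECKS.get(schema)
        return check is not None and check(value)
    if isinstance(schema, list):
        # A one-element list describes the shape of every item.
        if len(schema) != 1 or not isinstance(value, list):
            return False
        return all(matches_schema(item, schema[0]) for item in value)
    if isinstance(schema, dict):
        return isinstance(value, dict) and all(
            key in value and matches_schema(value[key], sub)
            for key, sub in schema.items()
        )
    return False


schema = {"name": "string", "email": "string", "age": "number"}
print(matches_schema({"name": "Ada", "email": "ada@example.com", "age": 36}, schema))
print(matches_schema({"name": "Ada", "email": "ada@example.com", "age": "36"}, schema))
```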
observe()
Find elements without taking action:
actions = await stagehand.observe(
    "find all product cards"
)
for action in actions:
    print(f"Element: {action['selector']}")
    print(f"Description: {action['description']}")
agent.execute()
Run multi-step autonomous tasks:
agent = stagehand.agent()
result = await agent.execute(
    "Search for Python tutorials and extract the top 3 results"
)
print(result["message"])
for action in result["actions"]:
    print(f"Action: {action['type']}")
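For longer agent runs, a per-type tally of the returned actions is often easier to read than the raw list. This sketch assumes only the result shape used in the example above (a `"message"` string and an `"actions"` list whose items have a `"type"` key); the sample data and its action type names are made up for illustration, and `summarize_actions` is a hypothetical helper.

```python
from collections import Counter


def summarize_actions(result: dict) -> Counter:
    """Count how many times each action type appears in an agent result."""
    return Counter(action["type"] for action in result.get("actions", []))


# Hand-written sample with the same keys the example above reads;
# the action type names here are invented for illustration.
sample_result = {
    "message": "Found 3 tutorials",
    "actions": [{"type": "goto"}, {"type": "act"}, {"type": "act"}, {"type": "extract"}],
}
for action_type, count in summarize_actions(sample_result).most_common():
    print(f"{action_type}: {count}")
```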
Configuration
Initialization Options
stagehand = Stagehand(
    env="BROWSERBASE",  # or "LOCAL"
    api_key="your-api-key",
    project_id="your-project-id",
    model={
        "model_name": "openai/gpt-4o",
        "api_key": "your-model-api-key",
    },
    verbose=1,  # 0, 1, or 2
    system_prompt="Custom instructions for the AI",
    self_heal=True,  # Enable auto-recovery
    cache_dir="./cache",  # Cache directory
)
Environment Variables
You can use environment variables for configuration:
export BROWSERBASE_API_KEY="your-api-key"
export BROWSERBASE_PROJECT_ID="your-project-id"
export OPENAI_API_KEY="your-openai-key"
import os

stagehand = Stagehand(
    env="BROWSERBASE",
    api_key=os.getenv("BROWSERBASE_API_KEY"),
    project_id=os.getenv("BROWSERBASE_PROJECT_ID"),
    model={
        "model_name": "openai/gpt-4o",
        "api_key": os.getenv("OPENAI_API_KEY"),
    },
)
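Note that `os.getenv` returns `None` for an unset variable, so a missing key would otherwise only surface later as an authentication error. A small loader that fails fast is one way to guard against that. This is a sketch: `load_stagehand_kwargs` and `REQUIRED_VARS` are hypothetical names, and the returned dictionary simply mirrors the constructor arguments shown above.

```python
import os
from typing import Mapping

REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID", "OPENAI_API_KEY")


def load_stagehand_kwargs(environ: Mapping[str, str] = os.environ) -> dict:
    """Build Stagehand keyword arguments, failing fast on missing variables."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "env": "BROWSERBASE",
        "api_key": environ["BROWSERBASE_API_KEY"],
        "project_id": environ["BROWSERBASE_PROJECT_ID"],
        "model": {
            "model_name": "openai/gpt-4o",
            "api_key": environ["OPENAI_API_KEY"],
        },
    }


# Usage, mirroring the constructor call above:
# stagehand = Stagehand(**load_stagehand_kwargs())
```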
Supported Models
The Python SDK supports the same LLM providers as the TypeScript SDK:
OpenAI: openai/gpt-4o, openai/gpt-4.1-mini, openai/o1
Anthropic: anthropic/claude-3-5-sonnet-latest
Google: google/gemini-2.0-flash
Cerebras: cerebras/llama-3.3-70b
Groq: groq/llama-3.3-70b-versatile
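The identifiers listed above follow a `provider/model` pattern. If your own code needs the two parts separately, for example to pick the matching API key, a small parser is enough. `split_model_name` below is a hypothetical helper written for illustration, not an SDK function.

```python
def split_model_name(model_name: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model)."""
    provider, sep, model = model_name.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_name!r}")
    return provider, model


for name in ("openai/gpt-4o", "anthropic/claude-3-5-sonnet-latest", "groq/llama-3.3-70b-versatile"):
    provider, model = split_model_name(name)
    print(f"{provider:<10} {model}")
```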
Working with the API Server
Running the API Server
The Python SDK requires the Stagehand API server to be running. There are two options:
Option 1: Hosted API
Use Browserbase’s hosted API (recommended):
stagehand = Stagehand(
    env="BROWSERBASE",
    api_key="your-browserbase-api-key",
    project_id="your-project-id",
    # API endpoint is automatically configured
)
Option 2: Self-Hosted
Run the API server locally:
# Clone the repository
git clone https://github.com/browserbase/stagehand
cd stagehand/packages/server
# Install dependencies
pnpm install
# Start the server
pnpm dev
Then configure your Python client:
stagehand = Stagehand(
    env="LOCAL",
    api_url="http://localhost:3000",  # Your local server
)
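When self-hosting, a quick connectivity check before creating the client can turn a confusing connection error into a clear message. The sketch below only tests that something accepts TCP connections on the configured host and port; it does not verify that the listener is actually the Stagehand server. `is_server_reachable` is a hypothetical helper.

```python
import socket
from urllib.parse import urlparse


def is_server_reachable(api_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(api_url)
    if parsed.hostname is None:
        return False
    # Fall back to the scheme's default port when the URL omits one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if not is_server_reachable("http://localhost:3000"):
    print("API server is not reachable; start it with `pnpm dev` first.")
```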
Examples
Web Scraping
async def scrape_articles():
    stagehand = Stagehand(env="BROWSERBASE", ...)
    await stagehand.init()
    try:
        await stagehand.page.goto("https://news.ycombinator.com")
        articles = await stagehand.extract(
            instruction="extract the top 5 articles",
            schema={
                "articles": [
                    {
                        "title": "string",
                        "url": "string",
                        "points": "number",
                    }
                ]
            },
        )
        return articles["articles"]
    finally:
        await stagehand.close()


articles = asyncio.run(scrape_articles())
for article in articles:
    print(f"{article['title']} - {article['points']} points")
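Extracted dictionaries can be converted into typed records so that a missing field is handled in one place instead of raising a `KeyError` later in the program. The sketch below assumes the result shape from the schema above; `Article` and `parse_articles` are hypothetical names, and the sample list is hand-written rather than real extraction output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    title: str
    url: str
    points: int


def parse_articles(raw_articles: list[dict]) -> list[Article]:
    """Convert extracted dictionaries into Article records, skipping incomplete ones."""
    articles = []
    for raw in raw_articles:
        if not all(key in raw for key in ("title", "url", "points")):
            continue  # a field is missing; skip rather than guess a value
        articles.append(Article(raw["title"], raw["url"], int(raw["points"])))
    return articles


# Hand-written sample shaped like the schema above (not real extraction output).
sample = [
    {"title": "Example post", "url": "https://example.com/a", "points": 120},
    {"title": "Missing points", "url": "https://example.com/b"},
]
for article in parse_articles(sample):
    print(f"{article.title} - {article.points} points")
```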
Form Automation
async def submit_form():
    stagehand = Stagehand(env="BROWSERBASE", ...)
    await stagehand.init()
    try:
        await stagehand.page.goto("https://example.com/form")
        await stagehand.act(
            "fill the name field with {{ name }}",
            variables={"name": "John Doe"},
        )
        await stagehand.act(
            "fill the email field with {{ email }}",
            variables={"email": "john@example.com"},
        )
        await stagehand.act("click the submit button")
        # Wait for confirmation
        result = await stagehand.extract(
            instruction="get the success message",
            schema={"message": "string"},
        )
        print(result["message"])
    finally:
        await stagehand.close()


asyncio.run(submit_form())
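Individual steps in a form flow can fail transiently, for instance when the page has not finished rendering. A generic retry wrapper around any awaitable step is one way to handle that. This is a sketch, independent of the SDK's own `self_heal` option; `with_retries` is a hypothetical helper and the default attempt count and delay are arbitrary.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    delay_seconds: float = 0.5,
) -> T:
    """Await `operation()` up to `attempts` times, sleeping between failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(delay_seconds)
    raise AssertionError("unreachable")


# Usage inside an async function, assuming `stagehand` is already initialized:
# await with_retries(lambda: stagehand.act("click the submit button"))
```

The wrapper takes a zero-argument callable rather than a coroutine object because a coroutine can only be awaited once; each retry needs a fresh one.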
Differences from TypeScript SDK
The Python SDK aims for API parity but has some differences:
TypeScript uses Zod schemas, Python uses dictionaries:
# Python
schema = {
    "name": "string",
    "age": "number",
    "active": "boolean",
}
// TypeScript
const schema = z.object({
  name: z.string(),
  age: z.number(),
  active: z.boolean(),
});
Python uses asyncio for async operations:
import asyncio

async def main():
    await stagehand.init()
    await stagehand.act("click")

asyncio.run(main())
Python uses snake_case instead of camelCase:
# Python
stagehand.act(action, model_name="gpt-4o")
// TypeScript
stagehand.act(action, { modelName: "gpt-4o" })
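The naming difference is mechanical, so option names can be translated programmatically when comparing against TypeScript examples. The sketch below shows one such conversion; `snake_to_camel` and `camelize_keys` are hypothetical helpers, and this page does not say whether the SDK performs an equivalent mapping internally.

```python
def snake_to_camel(name: str) -> str:
    """Convert 'model_name' to 'modelName'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camelize_keys(options: dict) -> dict:
    """Return a copy of `options` with top-level keys converted to camelCase."""
    return {snake_to_camel(key): value for key, value in options.items()}


print(camelize_keys({"model_name": "gpt-4o", "self_heal": True}))
```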
Type Hints
The Python SDK includes full type hints:
from typing import Any, Dict

from stagehand import Stagehand, ActResult, ExtractResult


async def my_function() -> Dict[str, Any]:
    stagehand: Stagehand = Stagehand(...)
    result: ActResult = await stagehand.act("click")
    data: ExtractResult = await stagehand.extract(...)
    return data
Resources
Python SDK Repository: Source code and examples
TypeScript SDK: Compare with TypeScript API
API Server: Run your own API server
Support
For help with the Python SDK: