Overview
Stagehand is a browser automation framework that powers Genie Helper’s platform scraping and publishing workflows. It provides AI-driven browser control using natural language commands.
Key Features:
- AI-Powered Actions: “Click the login button” instead of CSS selectors
- Cookie Injection: Bypass login screens by reusing captured sessions
- Data Extraction: Extract structured JSON from any webpage
- Screenshot Capture: Visual debugging and content archival
- Stealth Mode: Anti-bot detection with randomized user agents
Service Details:
- Port: 3002
- Process: pm2 stagehand-server
- Browser: Local Playwright (Chromium)
- LLM: Ollama qwen-2.5 for action planning
Architecture
Stagehand in the Stack
MCP Integration
Stagehand is exposed to the AnythingLLM agent via the Stagehand MCP Server.
Location: /home/daytona/workspace/source/scripts/stagehand-mcp-server.mjs:1
9 MCP Tools:
| Tool | Description |
|---|---|
| start-session | Initialize new browser session |
| navigate | Load URL in existing session |
| act | Perform action via natural language (click, type, scroll) |
| extract | Extract structured data using AI |
| observe | List interactive elements on page |
| screenshot | Capture current page as image |
| get-cookies | Read all cookies from session |
| set-cookies | Inject cookie array (for authenticated sessions) |
| close-session | End session and clean up browser |
Stagehand Models
LLM for Actions: Stagehand uses an Ollama model to interpret natural language actions and page structure.
Default Model: ollama/qwen-2.5
Configuration:
- Fast inference (< 5s per action)
- Good DOM understanding
- Reliable JSON output
- Small memory footprint (< 5GB RAM)
Alternative Models:
- ollama/dolphin-mistral:7b: Uncensored, good for adult content selectors
- ollama/llama3.2:3b: Faster but less accurate
- ollama/phi-3.5: Lightweight fallback
Browser Sessions
Starting a Session
MCP Tool: start-session
Parameters:
- headless (boolean, optional): Run browser without GUI (default: true)
Session Lifecycle
Each session is a dedicated Chromium browser instance:
- Memory: ~300MB RAM per session
- Concurrency Limit: ~33 sessions (10GB RAM available)
- Auto-Cleanup: Sessions expire after 30 minutes of inactivity
- Manual Cleanup: Always call close-session when done
Closing a Session
MCP Tool: close-session
Parameters:
- session_id (string): Session ID from start-session
Scraping Workflows
Cookie-Based Authentication
The primary scraping pattern uses captured cookies to bypass login.
Flow:
- Capture Cookies: User logs into platform → browser extension captures cookies
- Store Encrypted: Cookies stored in platform_sessions collection (AES-256-GCM)
- Start Session: Create new Stagehand browser session
- Inject Cookies: Use set-cookies tool to inject captured cookies
- Navigate: Load creator profile page (now authenticated)
- Extract Data: Use extract tool to scrape stats, posts, earnings
- Close Session: Clean up browser
Implementation: stepExecutors.js:222
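The flow above can be sketched roughly as follows. This is a minimal illustration, not the code in stepExecutors.js: `callTool(name, args)` is a hypothetical helper that invokes a Stagehand MCP tool and resolves with its result, and the `url` argument of navigate and the `{ session_id }` return shape of start-session are assumptions.

```javascript
// Sketch of the cookie-based scrape flow.
// ASSUMPTIONS: callTool is a hypothetical MCP client helper; start-session
// resolves to { session_id }; navigate accepts a `url` argument.
async function scrapeProfile(callTool, { profileUrl, cookies, instruction }) {
  const { session_id } = await callTool("start-session", { headless: true });
  try {
    // Inject the captured cookies first so the page load is already authenticated.
    await callTool("set-cookies", { session_id, cookies });
    await callTool("navigate", { session_id, url: profileUrl });
    return await callTool("extract", { session_id, instruction });
  } finally {
    // Release the browser even if one of the steps above throws.
    await callTool("close-session", { session_id });
  }
}
```

Wrapping the steps in try/finally keeps a failed extraction from leaking a Chromium instance until the inactivity timeout.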
Extracting Data
MCP Tool: extract
Parameters:
- session_id (string): Active session ID
- instruction (string): What to extract in natural language
- schema (object, optional): JSON schema for structured output
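As an illustration of these parameters, the sketch below builds one possible set of arguments. The field names inside the schema and the `missingRequired` helper are hypothetical and only show how a caller might sanity-check a result against the schema it sent.

```javascript
// Hypothetical arguments for the extract tool. The schema fields are
// illustrative only and are not taken from the real scrape flow.
const extractArgs = {
  session_id: "SESSION_ID_FROM_START_SESSION",
  instruction: "Extract the creator's display name and follower count",
  schema: {
    type: "object",
    properties: {
      display_name: { type: "string" },
      followers: { type: "integer" },
    },
    required: ["display_name"],
  },
};

// Returns the required keys from the schema that are absent in the result.
function missingRequired(schema, result) {
  return (schema.required ?? []).filter((key) => !(key in result));
}
```

Keeping the schema small tends to make the model's JSON output easier to validate, which matches the advice to simplify the schema when an extraction fails.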
Example: OnlyFans Profile Scrape
Full Workflow (from seed_platform_scrape_flow.mjs:1):
Publishing Workflows
Cross-Platform Posting
Stagehand also handles automated content publishing to creator platforms.
Supported Platforms:
- OnlyFans
- Fansly
- TikTok
- X (Twitter)
Publishing Flow
Worker Job: publish_post (media-worker queue)
Location: /home/daytona/workspace/source/media-worker/index.js:854
Steps:
- Fetch Post Data: Get scheduled_posts record + media file
- Start Session: Create Stagehand session
- Inject Cookies: Load platform cookies from platform_sessions
- Navigate to Upload: Go to platform’s upload page
- Upload Media: Use act tool to interact with file input
- Set Caption: Type caption via act tool
- Submit Post: Click publish button via act tool
- Verify: Screenshot + extract post URL
- Update Record: Mark scheduled_posts status as published
- Close Session: Clean up browser
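The browser-side portion of these steps can be pictured as an ordered list of tool calls. The sketch below is an assumption-laden illustration rather than the worker's actual code: `buildPublishPlan` is a hypothetical helper, the `url` argument of navigate is assumed, and the action wording is only an example of what the act tool might be given.

```javascript
// Builds the ordered tool calls for one post (session start/close and the
// database update are left to the caller).
// ASSUMPTIONS: helper name, `url` argument, and action phrasing are illustrative.
function buildPublishPlan({ session_id, cookies, uploadUrl, mediaPath, caption }) {
  return [
    { tool: "set-cookies", args: { session_id, cookies } },
    { tool: "navigate", args: { session_id, url: uploadUrl } },
    { tool: "act", args: { session_id, action: `Upload the file '${mediaPath}' to the file input` } },
    { tool: "act", args: { session_id, action: `Type '${caption}' into the caption field` } },
    { tool: "act", args: { session_id, action: "Click the publish button" } },
    // Verification: keep visual evidence, then read back the new post URL.
    { tool: "screenshot", args: { session_id } },
    { tool: "extract", args: { session_id, instruction: "Extract the URL of the post that was just published" } },
  ];
}
```

Representing the sequence as data makes it straightforward to log each step and to stop at the first failure before marking the record as published.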
Example: OnlyFans Post
Action Sequence:
Natural Language Actions
Using the act Tool
MCP Tool: act
Parameters:
- session_id (string): Active session ID
- action (string): Natural language description of action
Example Actions:
- Click: “Click the Login button”, “Click the 3rd profile image”
- Type: “Type ‘user@example.com’ into the email field”
- Scroll: “Scroll down 500 pixels”, “Scroll to the bottom of the page”
- Select: “Select ‘Premium’ from the subscription dropdown”
- Upload: “Upload the file ‘image.jpg’ to the file input”
- Wait: “Wait for the loading spinner to disappear”
Action Reliability
Factors Affecting Success:
- Page Complexity: Simple forms work better than complex SPAs
- Dynamic Content: React/Vue apps may need wait times
- CAPTCHA: Cannot bypass human verification
- Rate Limiting: Repeated actions may trigger bot detection
Best Practices:
- Use cookie injection instead of login automation when possible
- Add explicit waits after navigation: "Wait 3 seconds"
- Use screenshots to debug failed actions
- Fall back to CSS selectors for critical paths
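One way to apply the screenshot advice is to capture the page automatically whenever an action fails. The sketch below assumes a hypothetical `callTool(name, args)` helper that rejects when a tool call fails; `actWithDebug` and the `debugScreenshot` property are illustrative names, not part of Stagehand.

```javascript
// Runs an action and, on failure, attaches a screenshot to the error.
// ASSUMPTION: callTool is a hypothetical MCP client helper that rejects on failure.
async function actWithDebug(callTool, session_id, action) {
  try {
    return await callTool("act", { session_id, action });
  } catch (err) {
    // Capture the page state for later inspection; ignore screenshot failures
    // so the original error is never masked.
    err.debugScreenshot = await callTool("screenshot", { session_id }).catch(() => null);
    throw err;
  }
}
```

The original error is rethrown unchanged apart from the extra property, so existing error handling keeps working.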
Cookie Management
Setting Cookies
MCP Tool: set-cookies
Parameters:
- session_id (string): Active session ID
- cookies (array): Cookie objects
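The exact cookie object shape is not spelled out here. Since the browser is Playwright-based, a reasonable guess is that the tool accepts Playwright-style cookies; the sketch below converts a cookie as reported by the browser extension `chrome.cookies` API into that shape. Both the accepted format and the `toPlaywrightCookie` helper are assumptions.

```javascript
// Maps chrome.cookies sameSite values to the values Playwright expects.
const SAME_SITE = { strict: "Strict", lax: "Lax", no_restriction: "None" };

// Converts a chrome.cookies Cookie into a Playwright-style cookie object.
// ASSUMPTION: set-cookies accepts Playwright's cookie format.
function toPlaywrightCookie(c) {
  const cookie = {
    name: c.name,
    value: c.value,
    domain: c.domain,
    path: c.path || "/",
    httpOnly: Boolean(c.httpOnly),
    secure: Boolean(c.secure),
  };
  // expirationDate is in seconds since the epoch; session cookies have none,
  // so `expires` is simply left unset for them.
  if (typeof c.expirationDate === "number") cookie.expires = Math.floor(c.expirationDate);
  // "unspecified" has no Playwright equivalent and is omitted.
  if (SAME_SITE[c.sameSite]) cookie.sameSite = SAME_SITE[c.sameSite];
  return cookie;
}
```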
Getting Cookies
MCP Tool: get-cookies
Parameters:
- session_id (string): Active session ID
Screenshots
Capturing Screenshots
MCP Tool: screenshot
Parameters:
- session_id (string): Active session ID
Use Cases:
- Debug failed extractions
- Verify successful posts
- Archive page states
- HITL error reporting
Storing Screenshots
To save screenshots to Directus:
Performance & Limits
Resource Usage
Per Session:
- RAM: ~300MB
- CPU: ~10% during active browsing, ~0% idle
- Disk: ~50MB temp files (auto-cleanup)
System Limits:
- Total RAM: 10GB available for Stagehand
- Max Sessions: ~33 concurrent (10GB / 300MB)
- Current Usage: ~4-6 sessions during peak hours
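The session ceiling follows directly from the memory figures above. The short sketch below reproduces the arithmetic, treating 10GB as 10,000MB (an assumption about the units); `maxSessions` is a hypothetical helper name.

```javascript
// Number of whole sessions that fit in the RAM budget.
function maxSessions(totalRamMb, perSessionMb) {
  return Math.floor(totalRamMb / perSessionMb);
}

const limit = maxSessions(10_000, 300); // 33
// Share of the ceiling used at the upper end of peak usage (6 sessions).
const peakUtilization = 6 / limit;
```

At 4-6 concurrent sessions, peak usage stays under a fifth of the ceiling, which leaves headroom for bulk scraping runs.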
Action Latency
Typical Timings:
- start-session: 2-5 seconds
- navigate: 1-3 seconds
- act: 3-8 seconds (includes LLM inference)
- extract: 5-15 seconds (depends on page complexity)
- screenshot: 0.5-1 second
- close-session: 0.5-1 second
Optimization Tips:
- Reuse sessions for multiple operations
- Cache extraction results (see extractCache.js)
- Use parallel sessions for bulk scraping
- Disable screenshots unless debugging
Extraction Caching
Location: /home/daytona/workspace/source/server/utils/cache/extractCache.js:2
TTL: 5 minutes
Cache Key: Hash of sessionId + instruction + schema
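A cache with these properties might look roughly like the sketch below. It is not the contents of extractCache.js: `createExtractCache` is a hypothetical name, and to stay dependency-free the key is the serialized sessionId + instruction + schema string rather than a hash of it.

```javascript
const TTL_MS = 5 * 60 * 1000; // 5 minutes

// In-memory TTL cache for extraction results.
// ASSUMPTION: the key is the raw serialized tuple, not a hash as in the real module.
function createExtractCache(now = () => Date.now()) {
  const entries = new Map();
  const keyOf = (sessionId, instruction, schema) =>
    JSON.stringify([sessionId, instruction, schema ?? null]);

  return {
    get(sessionId, instruction, schema) {
      const key = keyOf(sessionId, instruction, schema);
      const hit = entries.get(key);
      if (!hit) return undefined;
      if (now() - hit.storedAt >= TTL_MS) {
        entries.delete(key); // expired: drop it and report a miss
        return undefined;
      }
      return hit.value;
    },
    set(sessionId, instruction, schema, value) {
      entries.set(keyOf(sessionId, instruction, schema), { value, storedAt: now() });
    },
  };
}
```

The injectable `now` function exists only to make expiry testable without waiting five minutes.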
Usage (Action Runner):
Stealth & Anti-Detection
Browser Fingerprinting
Stagehand uses stealth techniques to avoid bot detection.
Launch Args (stagehand-mcp-server.mjs:50):
User Agent Rotation
Cookie reuse includes the original user agent.
Flow:
- Browser extension captures cookies + navigator.userAgent
- Stored in platform_sessions.user_agent
- Stagehand session sets matching user agent before cookie injection
Rate Limiting
To avoid platform bans:
- Scrape Frequency: Configurable via creator_profiles.scrape_frequency (cron)
- Recommended: Every 6 hours (0 */6 * * *)
- Minimum Interval: 1 hour (scraping more often increases the risk of a ban)
Error Handling
Common Errors
1. Session Not Found
For failed actions:
- Take a screenshot to verify page state
- Use more specific action description
- Add a wait before action: "Wait 2 seconds then click Login"
For failed extractions:
- Simplify schema (fewer fields)
- Navigate to simpler sub-page
- Use observe tool to verify content exists
Error Propagation
The Stagehand MCP server surfaces HTTP errors to the agent.
MCP Error Handling (stagehand-mcp-server.mjs:26):
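As a rough illustration of surfacing HTTP errors, the sketch below maps an HTTP status and body to an MCP tool result, using the `isError` flag and text content items defined by the MCP tool result format. It is not the code at stagehand-mcp-server.mjs:26; `toToolResult` and the message wording are hypothetical.

```javascript
// Converts an HTTP response from the Stagehand service into an MCP tool result.
// ASSUMPTION: helper name and error message format are illustrative.
function toToolResult(status, bodyText) {
  if (status >= 200 && status < 300) {
    return { content: [{ type: "text", text: bodyText }] };
  }
  // Non-2xx responses are reported as tool errors so the agent can react,
  // for example by retrying or starting a new session.
  return {
    isError: true,
    content: [{ type: "text", text: `Stagehand HTTP ${status}: ${bodyText}` }],
  };
}
```

Returning the error as a tool result, rather than throwing, lets the agent see the status and message in its normal tool output.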
Troubleshooting
Check Stagehand Service
Restart Stagehand
Test Session Manually
Debug Failed Extraction
- Take a screenshot before extraction
- Use observe tool to list available elements
- Simplify instruction to single field
- Check Ollama model is running: curl http://localhost:11434/api/tags
Related Documentation
- Browser Extension - Cookie capture for authenticated sessions
- Directus CMS - Collections for jobs and scraped data
- Platform Connections - Managing creator credentials
