Overview
Genie Helper uses Stagehand (Playwright + vision LLM) to scrape creator profiles from OnlyFans, Fansly, and other platforms. The scraper supports cookie-based authentication, username/password login, and Twitter/X OAuth flows.
- Operation: scrape_profile (media-worker)
- Browser: Local Playwright (headless Chrome)
- Vision LLM: ollama/qwen-2.5 for page understanding
Architecture
Scraping Flow
Authentication Methods
The scraper supports three authentication types (stored in platform_connections.auth_type):
1. Cookie-Only (Recommended)
Flow: User logs in via browser extension → Extension captures cookies → Scraper injects cookies on next run
Pros:
- Most reliable (no credential validation)
- Works with 2FA, Google SSO, magic links
- Bypasses bot detection
Cons:
- Requires manual login once every 30-90 days (cookie expiry)
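The injection step can be sketched as a small conversion function. This is a minimal sketch, assuming the captured cookies arrive in the shape returned by the WebExtension cookies API and are injected with Playwright's browserContext.addCookies(). The helper name toPlaywrightCookies is hypothetical and not taken from media-worker/index.js.

```javascript
// Hypothetical helper: convert cookies as returned by the WebExtension
// cookies API (e.g. chrome.cookies.getAll) into the shape accepted by
// Playwright's browserContext.addCookies().
const SAME_SITE_MAP = {
  no_restriction: "None",
  lax: "Lax",
  strict: "Strict",
};

function toPlaywrightCookies(extensionCookies) {
  return extensionCookies.map((c) => {
    const cookie = {
      name: c.name,
      value: c.value,
      domain: c.domain,
      path: c.path || "/",
      httpOnly: Boolean(c.httpOnly),
      secure: Boolean(c.secure),
    };
    // Session cookies carry no expirationDate; leave `expires` unset for them.
    if (typeof c.expirationDate === "number") {
      cookie.expires = Math.floor(c.expirationDate);
    }
    // "unspecified" has no Playwright equivalent, so sameSite is omitted.
    const sameSite = SAME_SITE_MAP[c.sameSite];
    if (sameSite) cookie.sameSite = sameSite;
    return cookie;
  });
}

// Usage inside a scrape run (assumes an existing Playwright BrowserContext):
//   await context.addCookies(toPlaywrightCookies(storedCookies));
```

Converting before injection keeps the stored format independent of the browser automation library, so the extension and the scraper can evolve separately.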
2. Email/Password
Flow: Scraper navigates to login page → Fills email + password fields → Clicks sign-in button
Pros:
- Fully automated (no user interaction)
- Works for platforms without 2FA
Cons:
- May trigger CAPTCHA or bot detection
- Fails if 2FA is enabled
Implementation: media-worker/index.js:731-739
3. Twitter/X OAuth
Flow: Scraper clicks “Sign in with X” → Fills X credentials → OnlyFans/Fansly redirects back
Supported platforms: OnlyFans, Fansly
Credentials: x_username + x_password (separate from platform credentials)
Implementation: media-worker/index.js:708-728
HITL (Human-in-the-Loop) System
HITL is triggered when:
- No cookies are available in platform_sessions
- No credentials are stored (auth_type: cookie_only)
- Login fails (CAPTCHA, 2FA, or expired cookies)
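The triggers above can be read as one decision function. The sketch below assumes that the first two triggers combine (HITL is needed only when there are neither cookies nor usable credentials), which matches the "No cookies + no credentials" cause listed under Common Issues. The function name needsHitl and the fields on its argument are hypothetical.

```javascript
// Hypothetical decision helper mirroring the three HITL triggers.
// Field names on `state` are illustrative, not the media-worker's own.
function needsHitl(state) {
  const { hasCookies, authType, hasCredentials, loginFailed } = state;
  // A login attempt already failed (CAPTCHA, 2FA, or expired cookies).
  if (loginFailed) return true;
  // With usable cookies the scraper can proceed without a human.
  if (hasCookies) return false;
  // No cookies, and nothing to log in with either.
  return authType === "cookie_only" || !hasCredentials;
}
```

For example, needsHitl({ hasCookies: false, authType: "cookie_only", hasCredentials: false, loginFailed: false }) evaluates to true, while the same state with hasCookies: true evaluates to false.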
Flow
- Scraper creates a hitl_sessions record
- Dashboard shows a yellow banner
- User clicks “Download Extension” → Installs from public/extension/
- User navigates to platform and logs in normally
- Extension captures cookies → Sends to /api/credentials/store-platform-session
- Backend encrypts cookies → Stores in platform_sessions.encrypted_cookies
- Dashboard shows green checkmark → User clicks “Let’s Go” again
- Scraper injects cookies → Bypasses login wall → Success
Implementation: dashboard/src/pages/Dashboard/index.jsx (banner) + browser extension
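The backend encryption step in this flow could look like the sketch below. It assumes AES-256-GCM (the cipher this page names for the extension; the backend's actual cipher is not stated here), a 32-byte key supplied by the caller, and a storage layout of nonce, auth tag, and ciphertext concatenated into one base64 string. The helper names encryptCookies and decryptCookies are hypothetical.

```javascript
const nodeCrypto = require("node:crypto");

// Hypothetical helpers: encrypt a cookie array before writing it to
// platform_sessions.encrypted_cookies, and decrypt it again before
// injection. `key` must be a 32-byte Buffer.
function encryptCookies(cookies, key) {
  const iv = nodeCrypto.randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = nodeCrypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(cookies), "utf8"),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag(); // 16 bytes by default
  // Assumed layout: nonce | auth tag | ciphertext, base64-encoded.
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

function decryptCookies(payload, key) {
  const raw = Buffer.from(payload, "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = nodeCrypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the payload or tag was modified.
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8"));
}
```

GCM is an authenticated mode, so a tampered or truncated record fails at decryption instead of yielding corrupted cookies.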
Data Extraction
Profile Stats
Extracted fields:
- follower_count: Total subscribers/followers
- post_count: Total posts published
- subscription_price: Monthly price (e.g., “$9.99” or “Free”)
- bio_text: Profile biography
Implementation: media-worker/index.js:762-770
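Because subscription_price is extracted as display text, downstream code will likely need a numeric form. The following is a minimal sketch of one way to normalize it; the function name parseSubscriptionPrice and the choice to return integer cents are assumptions, not behavior documented for the media worker.

```javascript
// Hypothetical normalizer for the subscription_price string.
// Returns the price in cents, 0 for "Free", or null when unparseable.
function parseSubscriptionPrice(text) {
  const cleaned = String(text ?? "").replace(/,/g, "").trim();
  if (/^free$/i.test(cleaned)) return 0;
  const match = cleaned.match(/(\d+)(?:\.(\d{1,2}))?/);
  if (!match) return null;
  const dollars = Number(match[1]);
  // Pad a single fractional digit ("4.5") to two digits ("50").
  const cents = Number((match[2] || "0").padEnd(2, "0"));
  return dollars * 100 + cents;
}
```

Integer cents avoid floating-point rounding when prices are later summed or compared.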
Recent Posts
Extracted per post:
- caption: Post text (max 500 chars)
- posted_at: Date string (e.g., “Jan 15” or “2 days ago”)
- likes_count: Like count (0 if not shown)
- comments_count: Comment count (0 if not shown)
Storage: scraped_media collection
Implementation: media-worker/index.js:776-800
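The per-post constraints above (500-character captions, counts defaulting to 0) can be enforced in a single normalization step before storage. This is a sketch under the assumption that the vision LLM returns loosely typed values; normalizePost is a hypothetical name, and abbreviated counts such as "1.2K" are deliberately not handled and fall back to 0.

```javascript
const MAX_CAPTION_CHARS = 500;

// Hypothetical normalizer applied to each extracted post before it is
// written to the scraped_media collection.
function normalizePost(raw) {
  const toCount = (value) => {
    if (typeof value === "number" && Number.isFinite(value)) {
      return Math.max(0, Math.trunc(value));
    }
    // Accept digit strings with optional thousands separators ("1,204").
    const digits = String(value ?? "").replace(/,/g, "").trim();
    return /^\d+$/.test(digits) ? Number(digits) : 0;
  };
  // Array.from splits by code point, so emoji are never cut in half.
  const caption = Array.from(String(raw.caption ?? "").trim())
    .slice(0, MAX_CAPTION_CHARS)
    .join("");
  return {
    caption,
    posted_at: String(raw.posted_at ?? "").trim(),
    likes_count: toCount(raw.likes_count),
    comments_count: toCount(raw.comments_count),
  };
}
```

posted_at is kept as the raw display string here, since values like “2 days ago” only make sense relative to the scrape time.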
Supported Platforms
| Platform | Status | Auth Methods | Notes |
|---|---|---|---|
| OnlyFans | ✅ Full | Cookie, Email, X OAuth | Main platform |
| Fansly | ✅ Full | Cookie, Email, X OAuth | Similar to OF |
| | 🚧 Partial | Cookie only | High bot detection |
| TikTok | 🚧 Partial | Cookie only | Requires mobile user-agent |
| X/Twitter | 🚧 Partial | Cookie only | Rate limits apply |
| | 🚧 Partial | Cookie, Password | Subreddit-specific |
| Patreon | 📅 Planned | Cookie | Roadmap |
| ManyVids | 📅 Planned | Cookie | Roadmap |
Implementation: media-worker/index.js:641-645
Scrape Status States
Stored in platform_connections.scrape_status:
| Status | Meaning | Next Action |
|---|---|---|
| idle | Ready to scrape | Click “Scrape Now” |
| scraping | In progress | Wait (auto-updates) |
| hitl_required | Login needed | Install extension + log in |
| failed | Error occurred | Check error message + retry |
Implementation: media-worker/index.js:629,754,811,849
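The four states suggest a small state machine. The transition table below is an assumption inferred from the "Next Action" column (for example, that a failed or hitl_required connection returns to scraping on retry); it is not taken from media-worker/index.js, and canTransition is a hypothetical helper.

```javascript
// Hypothetical transition table for platform_connections.scrape_status.
const SCRAPE_TRANSITIONS = {
  idle: ["scraping"],
  scraping: ["idle", "hitl_required", "failed"],
  hitl_required: ["scraping"],
  failed: ["scraping"],
};

// Returns true when moving from `from` to `to` is allowed by the table.
function canTransition(from, to) {
  return (SCRAPE_TRANSITIONS[from] || []).includes(to);
}
```

Guarding status writes with a check like this prevents, for instance, a stale worker from flipping an idle connection straight to failed.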
Browser Extension
Path: public/extension/ (Firefox + Chrome manifest)
Size: ~15KB (no external dependencies)
Features
- Captures cookies on command (user clicks extension icon)
- Encrypts cookies client-side (AES-256-GCM)
- Sends to /api/credentials/store-platform-session
- Auto-detects platform from current URL
- Works on all 18 supported platforms
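Platform auto-detection from the current URL can be sketched as a hostname lookup. The mapping below covers only platforms named on this page whose domains are well known, and the platform keys ("onlyfans", "x", and so on) are hypothetical identifiers; the extension's real table and key names may differ.

```javascript
// Hypothetical hostname-to-platform table (partial).
const PLATFORM_HOSTS = {
  "onlyfans.com": "onlyfans",
  "fansly.com": "fansly",
  "tiktok.com": "tiktok",
  "x.com": "x",
  "twitter.com": "x",
  "patreon.com": "patreon",
  "manyvids.com": "manyvids",
};

function detectPlatform(url) {
  let hostname;
  try {
    hostname = new URL(url).hostname.toLowerCase();
  } catch {
    return null; // not a valid absolute URL
  }
  for (const [host, platform] of Object.entries(PLATFORM_HOSTS)) {
    // Match the bare domain or any subdomain (e.g. www.onlyfans.com).
    if (hostname === host || hostname.endsWith("." + host)) return platform;
  }
  return null;
}
```

Matching on the exact domain or a dot-prefixed suffix avoids false positives such as notx.com being treated as x.com.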
Installation
Firefox:
- Download extension.zip from Dashboard
- Open about:debugging#/runtime/this-firefox
- Click “Load Temporary Add-on”
- Select manifest.json
Chrome:
- Download extension.zip
- Open chrome://extensions
- Enable “Developer mode”
- Click “Load unpacked” → Select extension folder
Metadata Stripping
All scraped images are auto-stripped of EXIF/GPS metadata before upload to Directus.
Implementation: media-worker/index.js:817-835
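To illustrate what stripping EXIF involves, the sketch below removes APP1 segments (where EXIF, including GPS tags, and XMP are stored) from a JPEG buffer. It is a dependency-free illustration only: how the media worker actually strips metadata is not shown on this page, the function name stripJpegExif is hypothetical, other image formats and fill-byte padding are not handled, and a maintained image library is the safer choice in production.

```javascript
// Hypothetical pure-JS sketch: drop APP1 segments from a JPEG.
// `input` must be a Node.js Buffer.
function stripJpegExif(input) {
  if (input.length < 4 || input[0] !== 0xff || input[1] !== 0xd8) {
    throw new Error("not a JPEG: missing SOI marker");
  }
  const kept = [input.subarray(0, 2)]; // keep the SOI marker
  let offset = 2;
  while (offset + 4 <= input.length) {
    if (input[offset] !== 0xff) throw new Error("malformed JPEG segment");
    const marker = input[offset + 1];
    // Start of Scan: everything after this is image data, copied as-is.
    if (marker === 0xda) break;
    // Segment length is big-endian and includes its own two bytes.
    const length = input.readUInt16BE(offset + 2);
    const end = offset + 2 + length;
    if (length < 2 || end > input.length) {
      throw new Error("truncated JPEG segment");
    }
    // 0xE1 is APP1; every other header segment is preserved.
    if (marker !== 0xe1) kept.push(input.subarray(offset, end));
    offset = end;
  }
  kept.push(input.subarray(offset));
  return Buffer.concat(kept);
}
```

Only header segments are touched, so the compressed image data is not re-encoded and no quality is lost.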
Logs & Debugging
Common Issues
| Error | Cause | Fix |
|---|---|---|
| HITL_REQUIRED | No cookies + no credentials | Install extension + log in |
| Stagehand timeout | Page load >30s | Check internet connection |
| Login wall detected | Cookies expired | Re-capture cookies via extension |
| screenshot failed | Playwright crash | Restart stagehand-server |
Related
- Media Processing — Media worker operations
- Dashboard — Scrape trigger UI
- AI Agent — Stagehand MCP tools
