
Overview

Livestreaming lets you watch Skyvern’s browser viewport in real time as it navigates websites and performs actions. It is useful for debugging workflows, understanding agent behavior, and intervening when necessary.

Key Features

  • Real-time Viewport Streaming: Watch exactly what Skyvern sees in the browser
  • Debug Mode: Step through workflows and observe decision-making
  • VNC Integration: Connect to persistent browser sessions
  • Recording Playback: Review past runs with video recordings

How Livestreaming Works

Skyvern uses Chrome DevTools Protocol (CDP) and VNC to stream the browser viewport:
[Browser] <--CDP--> [Skyvern Server] <--WebSocket--> [Your Browser/Client]
The livestream provides:
  • Real-time visual feedback
  • Current page state
  • Action execution visualization
  • JavaScript evaluation results
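
As a rough mental model of the relay in the diagram above, the sketch below forwards frames from one source to several viewers using in-memory queues. It is a toy illustration only, not Skyvern's implementation: `relay`, `viewer`, and `demo` are hypothetical names, and the byte strings stand in for real viewport frames.

```python
import asyncio

async def relay(frames, clients):
    """Forward every frame to each client queue, then signal end of stream."""
    for frame in frames:
        for queue in clients:
            await queue.put(frame)
    for queue in clients:
        await queue.put(None)  # None marks the end of the stream

async def viewer(queue):
    """Collect frames from a queue until the end-of-stream marker arrives."""
    received = []
    while (frame := await queue.get()) is not None:
        received.append(frame)
    return received

async def demo():
    """Run one relay and two viewers concurrently; return what each viewer saw."""
    clients = [asyncio.Queue(), asyncio.Queue()]
    frames = [b"frame-1", b"frame-2", b"frame-3"]
    _, first, second = await asyncio.gather(
        relay(frames, clients), viewer(clients[0]), viewer(clients[1])
    )
    return first, second

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Every viewer receives the same frames in the same order, which is the property a livestream relay needs regardless of the transport underneath.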

Accessing Livestreams

Via Web UI

The easiest way to access livestreams is through the Skyvern web interface:
  1. During Task Execution
    • Navigate to your task or workflow run
    • Click the “Watch Live” button
    • The browser viewport streams in real time
  2. Debug Sessions
    • Open the workflow builder
    • Switch to “Debug” mode
    • Livestream starts automatically

Via API

Connect to a livestream programmatically:
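import asyncio
import websockets

async def watch_livestream(session_id: str, api_key: str) -> None:
    """Connect to the livestream WebSocket and report each message received."""
    uri = f"wss://api.skyvern.com/api/v1/stream/browser/{session_id}"

    # Older websockets releases take `extra_headers`; the newer asyncio client
    # (the default since websockets 14) takes `additional_headers` instead.
    async with websockets.connect(
        uri,
        extra_headers={"x-api-key": api_key}
    ) as websocket:
        print("Connected to livestream")

        async for message in websocket:
            # Handle stream data
            print(f"Received: {len(message)} bytes")

# Run the livestream watcher (a bare `await` only works inside async code)
asyncio.run(watch_livestream("pbs_123456789", "your-api-key"))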
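
The message format of the stream is not described here, so a handler should not assume one. The sketch below is a cautious starting point: it relies only on the fact that the `websockets` library yields `bytes` for binary frames and `str` for text frames, and treats JSON text as an assumption to be checked. `summarize_message` is a hypothetical helper name.

```python
import json

def summarize_message(message):
    """Return a short description of one WebSocket message."""
    if isinstance(message, bytes):
        # Binary frame: possibly image or framebuffer data
        return f"binary frame, {len(message)} bytes"
    try:
        payload = json.loads(message)
    except json.JSONDecodeError:
        # Text frame that is not JSON
        return f"text frame, {len(message)} chars"
    if isinstance(payload, dict):
        return f"JSON message with keys {sorted(payload)}"
    return f"JSON value of type {type(payload).__name__}"
```

Calling `print(summarize_message(message))` inside the `async for` loop is a quick way to learn what a given stream actually sends before writing a real handler.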

Debug Sessions

Debug sessions provide persistent browser sessions with livestreaming enabled:

Creating Debug Sessions

from skyvern import Skyvern

skyvern = Skyvern(api_key="your-api-key")

# Debug sessions are created automatically in the UI
# Or via API:
debug_session = await skyvern.create_debug_session(
    workflow_permanent_id="wpid_123",
    timeout_minutes=240  # 4 hour timeout
)

print(f"Debug session: {debug_session.debug_session_id}")
print(f"Browser session: {debug_session.browser_session_id}")
print(f"VNC URL: {debug_session.vnc_url}")

Debug Session Features

from datetime import datetime

from pydantic import BaseModel

class DebugSession(BaseModel):
    debug_session_id: str           # Unique session ID
    organization_id: str            # Your org ID
    user_id: str                    # User who created session
    workflow_permanent_id: str      # Associated workflow
    browser_session_id: str         # Browser instance
    vnc_url: str | None             # VNC connection URL
    status: str                     # created, running, completed, failed
    created_at: datetime
    modified_at: datetime
    deleted_at: datetime | None
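
Since `status` is a plain string, it can help to centralize how the four listed values are interpreted. The following is a minimal sketch, assuming `completed` and `failed` are the only terminal states; `is_finished` and the two constants are hypothetical names, not part of the Skyvern SDK.

```python
# Status values taken from the schema comment above
TERMINAL_STATUSES = {"completed", "failed"}
KNOWN_STATUSES = {"created", "running"} | TERMINAL_STATUSES

def is_finished(status: str) -> bool:
    """Return True if the session has reached a terminal status."""
    if status not in KNOWN_STATUSES:
        # Fail loudly on values this sketch does not know about
        raise ValueError(f"Unknown debug session status: {status!r}")
    return status in TERMINAL_STATUSES
```

Raising on unknown values keeps a polling loop from waiting forever if a new status is ever introduced.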

Using Debug Sessions

# Get existing debug session
debug_session = await skyvern.get_debug_session(
    workflow_permanent_id="wpid_123"
)

# Run workflow in debug mode
workflow_run = await skyvern.run_workflow(
    workflow_id="wpid_123",
    browser_session_id=debug_session.browser_session_id,
    parameters={"test_mode": True}
)

# Watch livestream while workflow runs
print(f"Watch at: {debug_session.vnc_url}")

VNC Integration

Skyvern uses VNC (Virtual Network Computing) for viewport streaming:

VNC Channel Architecture

# VNC Channel connects to browser session
class VncChannel:
    organization_id: str
    browser_session: PersistentBrowserSession
    x_api_key: str | None
    
    async def connect(self) -> None:
        """Establish VNC connection to browser"""
        # Connects to browser's VNC server
        
    async def evaluate_js(self, expression: str) -> Any:
        """Execute JavaScript in browser"""
        # Run JS in the live browser

Connecting via VNC Client

Use a VNC client to connect directly:
# Get VNC URL from debug session
# Example: vnc://browser.skyvern.com:5900?session=pbs_123

# Connect with VNC client
vncviewer browser.skyvern.com:5900
VNC connections are authenticated using your API key. Include it in the connection headers.
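
If you need the host, port, and session ID separately (for example, to pass them to a VNC client), the URL can be split with the standard library. This sketch assumes the URL has the shape shown in the example above, with the session ID in a `session` query parameter; `parse_vnc_url` is a hypothetical helper, and the fallback port 5900 is the conventional VNC default.

```python
from urllib.parse import parse_qs, urlsplit

def parse_vnc_url(vnc_url: str) -> dict:
    """Split a vnc:// URL into host, port, and session ID."""
    parts = urlsplit(vnc_url)
    if parts.scheme != "vnc" or parts.hostname is None:
        raise ValueError(f"Not a VNC URL: {vnc_url!r}")
    session_ids = parse_qs(parts.query).get("session", [])
    return {
        "host": parts.hostname,
        "port": parts.port or 5900,  # conventional VNC default port
        "session_id": session_ids[0] if session_ids else None,
    }
```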

CDP (Chrome DevTools Protocol) Streaming

For programmatic access, use CDP directly:

CDP Channel

# Abridged view of CdpChannel, which lives in
# skyvern.forge.sdk.routes.streaming.channels.cdp
from typing import Any

class CdpChannel:
    """Connect to browser via Chrome DevTools Protocol"""
    
    async def connect(self, cdp_url: str | None = None) -> None:
        """Connect to browser CDP endpoint"""
        # Default: ws://localhost:9222/devtools/browser/{id}
        
    async def evaluate_js(
        self,
        expression: str,
        arg: Any = None
    ) -> Any:
        """Evaluate JavaScript in browser"""
        page = self.page
        result = await page.evaluate(expression, arg)
        return result

Example: Custom CDP Connection

import asyncio

from playwright.async_api import async_playwright

async def connect_to_livestream(cdp_url: str):
    """Connect to browser via CDP for monitoring"""
    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp(cdp_url)
        
        # Get the first page
        context = browser.contexts[0]
        page = context.pages[0] if context.pages else await context.new_page()
        
        # Monitor navigation
        page.on("framenavigated", lambda frame: 
            print(f"Navigated to: {frame.url}")
        )
        
        # Watch for console messages
        page.on("console", lambda msg:
            print(f"Console: {msg.text}")
        )
        
        # Monitor network
        page.on("request", lambda request:
            print(f"Request: {request.url}")
        )
        
        # Keep connection alive
        await asyncio.sleep(3600)  # 1 hour

Recording Playback

All runs are automatically recorded and can be played back:

Accessing Recordings

task = await skyvern.run_task(
    prompt="Navigate and extract data",
    url="https://example.com"
)

# Get recording URL
if task.recording_url:
    print(f"Watch recording: {task.recording_url}")
    # URL is typically: https://recordings.skyvern.com/{run_id}.mp4

Via Web UI

  1. Open any completed task or workflow run
  2. Click the “View Recording” tab
  3. Playback controls:
    • Play/Pause
    • Speed control (0.5x, 1x, 2x)
    • Timeline scrubbing
    • Step-by-step navigation

Recording Schema

class BaseRunResponse(BaseModel):
    recording_url: str | None  # URL to MP4 recording
    screenshot_urls: list[str] | None  # Screenshots at each step
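
Both fields are optional, so code that consumes them should handle `None`. A minimal sketch, using a hypothetical `RunArtifacts` dataclass as a stand-in for the response object and a hypothetical `artifact_urls` helper:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RunArtifacts:
    """Hypothetical stand-in for the two fields shown in the schema above."""
    recording_url: Optional[str] = None
    screenshot_urls: Optional[List[str]] = None

def artifact_urls(run) -> List[str]:
    """Return the recording URL (if any) followed by all screenshot URLs."""
    urls = []
    if run.recording_url:
        urls.append(run.recording_url)
    # screenshot_urls may be None, so fall back to an empty list
    urls.extend(run.screenshot_urls or [])
    return urls
```

The helper only reads the `recording_url` and `screenshot_urls` attributes, so it should work with any object that exposes those two fields.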

Use Cases

1. Debugging Failed Workflows

# Run workflow and immediately watch
workflow_run = await skyvern.run_workflow(
    workflow_id="wpid_complex_form",
    parameters={"test_data": "..."}
)

# If run fails, review recording
if workflow_run.status == "failed":
    print(f"Failed: {workflow_run.failure_reason}")
    print(f"Watch what happened: {workflow_run.recording_url}")
    
    # Check screenshots at failure point (screenshot_urls may be None)
    for screenshot in workflow_run.screenshot_urls or []:
        print(f"Screenshot: {screenshot}")

2. Real-time Intervention

# Get the debug session for the workflow
debug_session = await skyvern.get_debug_session(
    workflow_permanent_id="wpid_123"
)

# Watch livestream and manually intervene if needed
# (Via VNC or CDP connection)
# connect_to_session and monitor_and_intervene stand in for your own helpers
browser = await connect_to_session(debug_session.browser_session_id)

# Monitor and step in if something goes wrong
await monitor_and_intervene(browser)

3. Quality Assurance

# Record all production runs
for item in batch_items:
    run = await skyvern.run_workflow(
        workflow_id="wpid_production",
        parameters=item
    )
    
    # Log recording URL for QA review
    log_for_qa_review({
        "run_id": run.run_id,
        "recording": run.recording_url,
        "status": run.status
    })
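
Once records like the ones logged above are collected, a small summary can point reviewers at the runs that need attention. This is a sketch that assumes each record is a dict with the `run_id`, `recording`, and `status` keys used above; `summarize_runs` is a hypothetical helper.

```python
from collections import Counter

def summarize_runs(records):
    """Count runs per status and list the run IDs that have no recording."""
    by_status = Counter(record["status"] for record in records)
    missing = [r["run_id"] for r in records if not r.get("recording")]
    return {"by_status": dict(by_status), "missing_recordings": missing}
```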

4. Training and Documentation

# Create reference recordings for documentation
reference_run = await skyvern.run_task(
    prompt="Complete the standard onboarding workflow",
    url="https://app.example.com/onboard"
)

# Use recording in training materials
training_materials.add_video(
    title="Standard Onboarding Process",
    url=reference_run.recording_url
)

Best Practices

When to Use Livestreaming

  1. Development and Testing
    • Building new workflows
    • Debugging complex interactions
    • Understanding website behavior
  2. Production Monitoring
    • Critical workflows that need oversight
    • High-value operations requiring verification
    • Troubleshooting production issues
  3. Training and Documentation
    • Creating workflow documentation
    • Training team members
    • Demonstrating capabilities

Resource Considerations

# Livestreaming uses additional resources
# Use debug sessions with appropriate timeouts

debug_session = await skyvern.create_debug_session(
    workflow_permanent_id="wpid_123",
    timeout_minutes=60  # Reasonable timeout
)

# Clean up when done
await skyvern.delete_debug_session(debug_session.debug_session_id)
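
To make sure cleanup happens even when debugging code raises, the create and delete calls can be wrapped in an async context manager. This is a sketch under the assumption that the client exposes `create_debug_session` and `delete_debug_session` with the arguments shown above; `debug_session_scope` is a hypothetical helper name.

```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def debug_session_scope(client, workflow_permanent_id, timeout_minutes=60):
    """Create a debug session and always delete it on exit."""
    session = await client.create_debug_session(
        workflow_permanent_id=workflow_permanent_id,
        timeout_minutes=timeout_minutes,
    )
    try:
        yield session
    finally:
        # Runs on normal exit and when the body raises
        await client.delete_debug_session(session.debug_session_id)
```

Usage would look like `async with debug_session_scope(skyvern, "wpid_123") as debug_session: ...`, with the session deleted when the block ends.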

Security

# Livestreams are authenticated
# Never share VNC URLs or API keys
# Sessions are organization-scoped

# Check permissions before streaming
if user.has_permission("view_livestream"):
    vnc_url = debug_session.vnc_url
else:
    raise PermissionError("Not authorized to view livestream")

Troubleshooting

Connection Issues

# If livestream won't connect:
# 1. Verify browser session is active
session = await skyvern.get_browser_session(session_id)
if session.status != "running":
    print("Session not active")

# 2. Check API key permissions
# 3. Ensure WebSocket connection is allowed
# 4. Verify network/firewall settings
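
Transient failures (for example, a session that is still starting) can often be handled by retrying with a growing delay. The sketch below retries any zero-argument coroutine function; `connect_with_retry` is a hypothetical helper, and the set of exceptions it catches is an assumption that you may need to extend for your WebSocket or CDP library.

```python
import asyncio

async def connect_with_retry(connect, attempts=4, base_delay=0.5):
    """Await `connect()` until it succeeds, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return await connect()
        except (OSError, asyncio.TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
```

For example, `await connect_with_retry(lambda: websockets.connect(uri))` would retry the connection attempt, since awaiting `websockets.connect(...)` returns an open connection.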

Performance Issues

# If livestream is laggy:
# 1. Check network bandwidth
# 2. Reduce browser viewport size
# 3. Use recording playback instead of live stream
# 4. Close unnecessary browser tabs

Debug Session Timeout

# Extend timeout for long debugging sessions
try:
    await skyvern.renew_browser_session(
        session_id=debug_session.browser_session_id
    )
except BrowserSessionNotRenewable:
    # Create new debug session
    new_session = await skyvern.create_debug_session(
        workflow_permanent_id="wpid_123"
    )
