Diff Testing

Overview

Agent Browser includes built-in diff commands for comparing page states. Common uses include:

Visual regression testing
Detecting unintended changes between deployments
Comparing staging vs production
Monitoring page changes over time
Testing responsive design at different breakpoints

Diff capabilities include:

Snapshot diff - Text-based comparison using the Myers diff algorithm
Screenshot diff - Pixel-based visual comparison using Canvas API
URL diff - Compare two different URLs side-by-side

Snapshot Diff

Compare accessibility tree snapshots to detect structural changes.

Compare Current vs Last Snapshot

# Take initial snapshot
agent-browser open https://example.com
agent-browser snapshot > before.txt

# Make changes or navigate
agent-browser click @e1

# Compare current state vs previous
agent-browser diff snapshot --baseline before.txt

Output:

+ - link "New Feature" [ref=e7]
- - link "Old Feature" [ref=e5]
  - button "Submit" [ref=e2]
  - textbox "Email" [ref=e3]

Additions: 1
Removals: 1
Unchanged: 45
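
As a rough illustration of how the summary footer relates to the edit list, the sketch below tallies edits by type and renders them with the `+`/`-` prefixes shown in the output. The `DiffEdit` shape is modeled on the `{ type, line }` objects in the `src/diff.ts` excerpt; `summarize` and `formatEdit` are hypothetical helper names, not part of agent-browser.

```typescript
// Assumed edit shape, modeled on the { type, line } objects in src/diff.ts.
type DiffEdit = { type: 'equal' | 'insert' | 'delete'; line: string };

interface DiffSummary {
  additions: number;
  removals: number;
  unchanged: number;
}

// Count each edit type to produce the Additions/Removals/Unchanged footer.
function summarize(edits: DiffEdit[]): DiffSummary {
  const summary: DiffSummary = { additions: 0, removals: 0, unchanged: 0 };
  for (const edit of edits) {
    if (edit.type === 'insert') summary.additions++;
    else if (edit.type === 'delete') summary.removals++;
    else summary.unchanged++;
  }
  return summary;
}

// Prefix a line the way the sample output does: "+ ", "- ", or two spaces.
function formatEdit(edit: DiffEdit): string {
  if (edit.type === 'insert') return '+ ' + edit.line;
  if (edit.type === 'delete') return '- ' + edit.line;
  return '  ' + edit.line;
}

const sampleEdits: DiffEdit[] = [
  { type: 'insert', line: '- link "New Feature" [ref=e7]' },
  { type: 'delete', line: '- link "Old Feature" [ref=e5]' },
  { type: 'equal', line: '- button "Submit" [ref=e2]' },
  { type: 'equal', line: '- textbox "Email" [ref=e3]' },
];

// For these four edits: additions 1, removals 1, unchanged 2.
const sampleSummary = summarize(sampleEdits);
```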

Scoped Snapshot Diff

Compare only a specific section of the page:

agent-browser diff snapshot --baseline before.txt --selector "#main-content"

Compact Diff

Remove empty structural elements for cleaner comparison:

agent-browser diff snapshot --baseline before.txt --compact

Implementation Details

The snapshot diff uses the Myers diff algorithm for efficient line-level comparison. See src/diff.ts for the implementation:

/**
 * Myers diff algorithm operating on arrays of lines.
 * Returns a minimal edit script.
 */
function myersDiff(a: string[], b: string[]): DiffEdit[] {
  const n = a.length;
  const m = b.length;
  const max = n + m;

  if (max === 0) return [];

  // Optimize: if both are identical, skip diff
  if (n === m) {
    let identical = true;
    for (let i = 0; i < n; i++) {
      if (a[i] !== b[i]) {
        identical = false;
        break;
      }
    }
    if (identical) return a.map((line) => ({ type: 'equal' as const, line }));
  }

  // ... Myers algorithm implementation
}

The algorithm produces a minimal edit script with three types of edits:

equal - Lines that are unchanged
insert - Lines added in the new snapshot
delete - Lines removed from the old snapshot
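
The excerpt above elides the core of the algorithm. For readers who want to see how a minimal edit script can be derived, here is a textbook Myers diff in sketch form. It is not the code in `src/diff.ts`; the name `myersDiffSketch` and the exact `DiffEdit` shape are assumptions made for illustration.

```typescript
// Assumed edit shape, modeled on the excerpt above; the real type may differ.
type DiffEdit = { type: 'equal' | 'insert' | 'delete'; line: string };

function myersDiffSketch(a: string[], b: string[]): DiffEdit[] {
  const n = a.length;
  const m = b.length;
  const max = n + m;
  if (max === 0) return [];

  const offset = max; // shifts diagonal k = x - y into a non-negative index
  const v = new Array<number>(2 * max + 2).fill(0);
  const trace: number[][] = [];

  // Forward pass: for each edit count d, record the furthest x on each diagonal.
  search: for (let d = 0; d <= max; d++) {
    trace.push(v.slice());
    for (let k = -d; k <= d; k += 2) {
      const down = k === -d || (k !== d && v[k - 1 + offset] < v[k + 1 + offset]);
      let x = down ? v[k + 1 + offset] : v[k - 1 + offset] + 1;
      let y = x - k;
      // Follow the "snake" of matching lines for free.
      while (x < n && y < m && a[x] === b[y]) {
        x++;
        y++;
      }
      v[k + offset] = x;
      if (x >= n && y >= m) break search;
    }
  }

  // Backward pass: walk the recorded states from the end to recover the edits.
  const edits: DiffEdit[] = [];
  let x = n;
  let y = m;
  for (let d = trace.length - 1; d >= 0; d--) {
    const prev = trace[d];
    const k = x - y;
    const down = k === -d || (k !== d && prev[k - 1 + offset] < prev[k + 1 + offset]);
    const prevK = down ? k + 1 : k - 1;
    const prevX = prev[prevK + offset];
    const prevY = prevX - prevK;
    while (x > prevX && y > prevY) {
      edits.push({ type: 'equal', line: a[x - 1] });
      x--;
      y--;
    }
    if (d > 0) {
      if (down) edits.push({ type: 'insert', line: b[prevY] });
      else edits.push({ type: 'delete', line: a[prevX] });
    }
    x = prevX;
    y = prevY;
  }
  return edits.reverse();
}

// Yields: equal "a", delete "b", equal "c", insert "d".
const exampleEdits = myersDiffSketch(['a', 'b', 'c'], ['a', 'c', 'd']);
```

The forward pass stores one copy of the frontier per edit count, which keeps the sketch short at the cost of memory; a production implementation may trade this for a linear-space variant.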

Screenshot Diff

Compare images pixel-by-pixel to detect visual changes.

Basic Screenshot Diff

# Take baseline screenshot
agent-browser open https://example.com
agent-browser screenshot baseline.png

# Make changes or wait for updates
agent-browser click @e1

# Compare current vs baseline
agent-browser diff screenshot --baseline baseline.png

Output:

Diff saved to: /home/user/.agent-browser/tmp/diffs/diff-1709654321000.png
Total pixels: 921600
Different pixels: 1842
Mismatch: 0.20%
Match: false
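
The percentage in this output follows from the two pixel counts. The sketch below reproduces the arithmetic under two assumptions that the output itself does not confirm: that the image is 1280x720 (one size whose area is 921600 pixels) and that `Match` is true only when no pixel differs. `summarizePixels` is a hypothetical helper name.

```typescript
interface PixelSummary {
  totalPixels: number;
  differentPixels: number;
  mismatchPercent: number;
  match: boolean;
}

function summarizePixels(width: number, height: number, differentPixels: number): PixelSummary {
  const totalPixels = width * height;
  const mismatchPercent = totalPixels === 0 ? 0 : (differentPixels / totalPixels) * 100;
  // Assumption: any differing pixel counts as a failed match.
  return { totalPixels, differentPixels, mismatchPercent, match: differentPixels === 0 };
}

const pixelSummary = summarizePixels(1280, 720, 1842);
// 1842 / 921600 is about 0.19987%, which rounds to "0.20" at two decimals.
const mismatchLabel = pixelSummary.mismatchPercent.toFixed(2) + '%';
```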

Custom Output Path

agent-browser diff screenshot --baseline baseline.png -o ./diffs/change.png

Adjust Color Threshold

Control sensitivity to color differences (0-1, default 0.1):

# More sensitive (detects subtle color changes)
agent-browser diff screenshot --baseline baseline.png -t 0.05

# Less sensitive (ignores minor variations)
agent-browser diff screenshot --baseline baseline.png -t 0.3

The threshold represents the maximum color distance (normalized 0-1) before pixels are marked as different. Lower values detect smaller changes.
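
To make the threshold concrete, the sketch below applies the same formula as the `src/diff.ts` excerpt in the next section (`threshold * 255 * Math.sqrt(3)`) to a pair of similar grays. The helper names `colorDistance` and `pixelsDiffer` are illustrative, not agent-browser APIs.

```typescript
type RGB = [number, number, number];

// Largest possible RGB distance: black (0,0,0) to white (255,255,255), about 441.67.
const MAX_RGB_DISTANCE = 255 * Math.sqrt(3);

// Euclidean distance between two colors in RGB space.
function colorDistance(a: RGB, b: RGB): number {
  const dr = a[0] - b[0];
  const dg = a[1] - b[1];
  const db = a[2] - b[2];
  return Math.sqrt(dr * dr + dg * dg + db * db);
}

// A pixel counts as different when its distance exceeds the scaled threshold.
function pixelsDiffer(a: RGB, b: RGB, threshold = 0.1): boolean {
  return colorDistance(a, b) > threshold * MAX_RGB_DISTANCE;
}

const gray: RGB = [200, 200, 200];
const lighterGray: RGB = [220, 220, 220];

// The distance between the two grays is 20 * sqrt(3), about 34.64.
const strict = pixelsDiffer(gray, lighterGray, 0.05); // true: limit is about 22.08
const standard = pixelsDiffer(gray, lighterGray, 0.1); // false: limit is about 44.17
const lenient = pixelsDiffer(gray, lighterGray, 0.3); // false: limit is about 132.50
```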

Implementation Details

Screenshot diff uses the browser’s Canvas API for pixel comparison. See src/diff.ts:160-340:

/**
 * Compare two image buffers using the browser's Canvas API for pixel comparison.
 * Uses an isolated blank page to avoid CSP interference or DOM side effects on the
 * user's page. Images are served via intercepted routes to avoid large base64 payloads
 * through page.evaluate (which can be slow or hit CDP message size limits).
 */
export async function diffScreenshots(
  context: BrowserContext,
  baselineBuffer: Buffer,
  currentBuffer: Buffer,
  opts: { threshold?: number; outputPath?: string; baselineMime?: string }
): Promise<DiffScreenshotData> {
  const baselineMime = opts.baselineMime ?? 'image/png';
  const threshold = opts.threshold ?? 0.1;

  // Create isolated page and serve images via routes
  const nonce = Math.random().toString(36).slice(2, 10);
  const blankUrl = `${DIFF_ROUTE_PREFIX}/${nonce}/index.html`;
  const baselineUrl = `${DIFF_ROUTE_PREFIX}/${nonce}/baseline.png`;
  const currentUrl = `${DIFF_ROUTE_PREFIX}/${nonce}/current.png`;

  const diffPage = await context.newPage();

  // ... route setup and image loading

  // Pixel comparison algorithm:
  // For each pixel, calculate Euclidean distance in RGB space
  const maxColorDistance = threshold * 255 * Math.sqrt(3);
  for (let i = 0; i < totalPixels; i++) {
    const offset = i * 4;
    const rA = dataA[offset], gA = dataA[offset + 1], bA = dataA[offset + 2];
    const rB = dataB[offset], gB = dataB[offset + 1], bB = dataB[offset + 2];
    const dr = rA - rB, dg = gA - gB, db = bA - bB;
    const dist = Math.sqrt(dr * dr + dg * dg + db * db);
    if (dist > maxColorDistance) {
      differentPixels++;
      // Mark different pixels red in output
      diffData[offset] = 255;
      diffData[offset + 1] = 0;
      diffData[offset + 2] = 0;
      diffData[offset + 3] = 255;
    }
  }
  // ...
}

Key implementation details:

Isolated page: Diff runs in a separate page to avoid CSP issues
Route interception: Images served via routes instead of base64 in page.evaluate() to avoid CDP message size limits
Euclidean distance: Color differences calculated in RGB space
Output image: Different pixels highlighted in red, matching pixels dimmed to 30% brightness
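
The output-image rule can be sketched on raw RGBA buffers as follows. How the real implementation dims matching pixels is not shown in the excerpt, so scaling the baseline pixel's channels by 0.3 is an assumption, as is the helper name `renderDiff`.

```typescript
interface RenderedDiff {
  diffData: Uint8ClampedArray;
  differentPixels: number;
  totalPixels: number;
}

// Compare two same-sized RGBA buffers and build the highlighted output buffer.
function renderDiff(dataA: Uint8ClampedArray, dataB: Uint8ClampedArray, threshold = 0.1): RenderedDiff {
  if (dataA.length !== dataB.length) {
    throw new Error('Buffers must have the same length');
  }
  const totalPixels = dataA.length / 4;
  const diffData = new Uint8ClampedArray(dataA.length);
  const maxColorDistance = threshold * 255 * Math.sqrt(3);
  let differentPixels = 0;

  for (let i = 0; i < totalPixels; i++) {
    const o = i * 4;
    const dr = dataA[o] - dataB[o];
    const dg = dataA[o + 1] - dataB[o + 1];
    const db = dataA[o + 2] - dataB[o + 2];
    if (Math.sqrt(dr * dr + dg * dg + db * db) > maxColorDistance) {
      differentPixels++;
      diffData.set([255, 0, 0, 255], o); // opaque red marks a changed pixel
    } else {
      // Assumed dimming: keep the baseline color at 30% brightness.
      diffData[o] = Math.round(dataA[o] * 0.3);
      diffData[o + 1] = Math.round(dataA[o + 1] * 0.3);
      diffData[o + 2] = Math.round(dataA[o + 2] * 0.3);
      diffData[o + 3] = 255;
    }
  }
  return { diffData, differentPixels, totalPixels };
}

// Two pixels: the first is identical in both images, the second flips black to white.
const imageA = new Uint8ClampedArray([200, 200, 200, 255, 0, 0, 0, 255]);
const imageB = new Uint8ClampedArray([200, 200, 200, 255, 255, 255, 255, 255]);
const rendered = renderDiff(imageA, imageB);
// rendered.diffData holds [60, 60, 60, 255, 255, 0, 0, 255]; differentPixels is 1.
```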

Dimension Mismatch

If the images have different dimensions, the diff fails and reports a full mismatch:

Dimension mismatch: baseline is 1920x1080, current is 1280x720
Mismatch: 100%

Ensure both screenshots use the same viewport size:

agent-browser set viewport 1920 1080
agent-browser screenshot baseline.png
# ... make changes ...
agent-browser screenshot current.png
agent-browser diff screenshot --baseline baseline.png
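
When baselines are stored as files, it can help to check dimensions before running a diff. The sketch below reads width and height from a PNG header, where the IHDR chunk stores both as big-endian 32-bit integers at byte offsets 16 and 20. It assumes both files are PNGs; `readPngSize` and `assertSameSize` are hypothetical helpers, not agent-browser functions.

```typescript
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

interface ImageSize {
  width: number;
  height: number;
}

// Read the image size from the IHDR chunk that follows the 8-byte PNG signature.
function readPngSize(bytes: Uint8Array): ImageSize {
  if (bytes.length < 24 || !PNG_SIGNATURE.every((b, i) => bytes[i] === b)) {
    throw new Error('Not a PNG file');
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // DataView reads big-endian by default, which matches the PNG format.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// Fail early with a message shaped like the CLI error shown above.
function assertSameSize(baseline: Uint8Array, current: Uint8Array): void {
  const b = readPngSize(baseline);
  const c = readPngSize(current);
  if (b.width !== c.width || b.height !== c.height) {
    throw new Error(
      `Dimension mismatch: baseline is ${b.width}x${b.height}, current is ${c.width}x${c.height}`
    );
  }
}
```

In Node.js, the `Buffer` returned by `fs.readFileSync` is a `Uint8Array`, so file contents can be passed to these helpers directly.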

URL Diff

Compare two different URLs directly.

Snapshot Diff Between URLs

# Compare staging vs production
agent-browser diff url https://staging.example.com https://example.com

This opens both URLs, takes snapshots, and shows the diff.

Screenshot Diff Between URLs

agent-browser diff url https://v1.example.com https://v2.example.com --screenshot

Takes screenshots of both URLs and performs pixel comparison.

Custom Wait Strategy

Wait for a specific load state before comparing:

# Wait for network idle (all requests complete)
agent-browser diff url https://staging.example.com https://example.com --wait-until networkidle

# Wait for DOM loaded
agent-browser diff url https://staging.example.com https://example.com --wait-until domcontentloaded

# Wait for page load event
agent-browser diff url https://staging.example.com https://example.com --wait-until load

Scoped URL Diff

Compare only a specific section:

# Compare main content area only
agent-browser diff url https://staging.example.com https://example.com --selector "#main-content"

# Compare with screenshot
agent-browser diff url https://staging.example.com https://example.com --screenshot --selector "#app"

Testing Workflows

Visual Regression Testing

Detect unintended visual changes between deployments:

#!/bin/bash
# visual-regression-test.sh

BASELINE_URL="https://production.example.com"
CURRENT_URL="https://staging.example.com"

# Compare snapshots
echo "Comparing accessibility trees..."
agent-browser diff url "$BASELINE_URL" "$CURRENT_URL" > snapshot-diff.txt

# Compare screenshots
echo "Comparing visual appearance..."
agent-browser diff url "$BASELINE_URL" "$CURRENT_URL" --screenshot -t 0.1 > screenshot-diff.txt

# Check for significant differences
MISMATCH=$(grep "Mismatch:" screenshot-diff.txt | awk '{print $2}' | tr -d '%')
if [ -z "$MISMATCH" ]; then
  echo "FAIL: No mismatch percentage found in screenshot-diff.txt"
  exit 1
fi
if (( $(echo "$MISMATCH > 1.0" | bc -l) )); then
  echo "FAIL: Visual regression detected ($MISMATCH% different)"
  exit 1
fi

echo "PASS: No significant visual changes"
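
If the test harness is written in TypeScript rather than shell, the same gate can be expressed without `grep`, `awk`, and `bc`. This sketch assumes the output contains a line of the form `Mismatch: 0.20%`, as in the sample output earlier on this page; `parseMismatchPercent` and `checkRegression` are hypothetical names.

```typescript
// Extract the percentage from a "Mismatch: N%" line, or null if none is present.
function parseMismatchPercent(output: string): number | null {
  const found = /^Mismatch:\s*([\d.]+)%/m.exec(output);
  return found ? Number.parseFloat(found[1]) : null;
}

interface GateResult {
  ok: boolean;
  message: string;
}

// Pass when the mismatch is at or below maxPercent; fail when it is higher or unreadable.
function checkRegression(output: string, maxPercent: number): GateResult {
  const mismatch = parseMismatchPercent(output);
  if (mismatch === null) {
    return { ok: false, message: 'FAIL: No mismatch percentage found' };
  }
  if (mismatch > maxPercent) {
    return { ok: false, message: `FAIL: Visual regression detected (${mismatch}% different)` };
  }
  return { ok: true, message: 'PASS: No significant visual changes' };
}

const sampleOutput = [
  'Total pixels: 921600',
  'Different pixels: 1842',
  'Mismatch: 0.20%',
  'Match: false',
].join('\n');

const gate = checkRegression(sampleOutput, 1.0); // ok is true: 0.2 does not exceed 1.0
```

Treating a missing percentage as a failure keeps a crashed or misconfigured diff from passing silently.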

Responsive Design Testing

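Capture the page at different breakpoints. Screenshot diff requires identical dimensions, so compare each capture against a baseline taken at the same viewport rather than against another breakpoint: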

#!/bin/bash
# responsive-diff-test.sh

URL="https://example.com"

# Desktop baseline
agent-browser set viewport 1920 1080
agent-browser open "$URL"
agent-browser screenshot desktop.png

# Tablet
agent-browser set viewport 768 1024
agent-browser open "$URL"
agent-browser screenshot tablet.png

# Mobile
agent-browser set viewport 375 667
agent-browser open "$URL"
agent-browser screenshot mobile.png

echo "Screenshots saved: desktop.png, tablet.png, mobile.png"

Monitoring Changes Over Time

Detect changes to a page over time:

#!/bin/bash
# monitor-changes.sh

URL="https://example.com/news"
BASELINE_DIR="./baselines"
DIFF_DIR="./diffs"
DATE=$(date +%Y-%m-%d-%H-%M-%S)

mkdir -p "$BASELINE_DIR" "$DIFF_DIR"

agent-browser open "$URL"
agent-browser wait --load networkidle

# Take current snapshot
agent-browser snapshot > "$BASELINE_DIR/snapshot-$DATE.txt"

# Compare with yesterday's baseline (if it exists)
# Note: -d "yesterday" is GNU date syntax; on macOS/BSD use: date -v-1d +%Y-%m-%d
YESTERDAY=$(date -d "yesterday" +%Y-%m-%d)
BASELINE=$(ls "$BASELINE_DIR"/snapshot-"$YESTERDAY"*.txt 2>/dev/null | head -1)

if [ -f "$BASELINE" ]; then
  echo "Comparing with baseline: $BASELINE"
  agent-browser diff snapshot --baseline "$BASELINE" > "$DIFF_DIR/diff-$DATE.txt"
  
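  ADDITIONS=$(grep "Additions:" "$DIFF_DIR/diff-$DATE.txt" | awk '{print $2}')
  REMOVALS=$(grep "Removals:" "$DIFF_DIR/diff-$DATE.txt" | awk '{print $2}')
  # Default to 0 so the numeric tests below do not fail on a missing summary line
  ADDITIONS=${ADDITIONS:-0}
  REMOVALS=${REMOVALS:-0}

  if [ "$ADDITIONS" -gt 0 ] || [ "$REMOVALS" -gt 0 ]; then
    echo "Changes detected: +$ADDITIONS -$REMOVALS"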
    cat "$DIFF_DIR/diff-$DATE.txt"
  else
    echo "No changes detected"
  fi
else
  echo "No baseline found for comparison"
fi

agent-browser close

CI/CD Integration

Integrate diff testing into continuous integration:

# .github/workflows/visual-regression.yml
name: Visual Regression Testing

on:
  pull_request:
    branches: [main]

jobs:
  visual-regression:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      
      - name: Install dependencies
        run: |
          npm install -g agent-browser
          agent-browser install --with-deps
      
      - name: Run visual regression tests
        run: |
          # Compare PR preview vs production
          PREVIEW_URL="https://preview-${{ github.event.pull_request.number }}.example.com"
          PROD_URL="https://example.com"
          
          agent-browser diff url "$PROD_URL" "$PREVIEW_URL" --screenshot -o diff.png > diff-output.txt
          
          # Check mismatch percentage
          MISMATCH=$(grep "Mismatch:" diff-output.txt | awk '{print $2}' | tr -d '%')
          echo "Visual difference: $MISMATCH%"
          
          if (( $(echo "$MISMATCH > 2.0" | bc -l) )); then
            echo "::error::Visual regression detected ($MISMATCH% different)"
            exit 1
          fi
      
      - name: Upload diff artifacts
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: visual-diff
          path: |
            diff.png
            diff-output.txt

Best Practices

Snapshot Diff

Use compact mode for cleaner diffs: --compact
Scope to relevant sections to reduce noise: --selector "#main"
Wait for dynamic content before diffing: agent-browser wait --load networkidle
Store baselines in version control for reproducibility

Screenshot Diff

Set consistent viewport before screenshots:
```
agent-browser set viewport 1920 1080
```
Adjust threshold based on your needs:
- Strict: -t 0.05 (detects subtle changes)
- Moderate: -t 0.1 (default, good balance)
- Lenient: -t 0.3 (ignores minor variations)

Wait for animations to complete:

agent-browser wait 2000  # Wait for animations
agent-browser screenshot current.png

Use full page screenshots for complete comparison:

agent-browser screenshot --full baseline.png

Hide dynamic elements (timestamps, ads) before comparison:

agent-browser eval --stdin <<'EOF'
document.querySelectorAll('.timestamp, .ad').forEach(el => el.style.visibility = 'hidden');
EOF
agent-browser screenshot current.png

URL Diff

Use --wait-until networkidle for dynamic pages
Scope to stable sections to avoid flaky diffs
Set viewport size for consistent layout

Authenticate before diffing if comparing logged-in pages:

agent-browser auth login myapp
agent-browser diff url https://staging.app.com/dashboard https://app.com/dashboard

Limitations

Snapshot Diff

Text-based only - Does not detect visual styling changes (colors, fonts, spacing)
Structure-dependent - Minor HTML restructuring can cause large diffs
Ref numbers change - Refs (@e1, @e2) are not stable across snapshots
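
One way to work around unstable refs when comparing saved snapshot files yourself is to strip the ref markers before comparing. The sketch below assumes refs appear as `[ref=eN]` suffixes, as in the sample output on this page; `stripRefs` and `sameStructure` are hypothetical helpers, not agent-browser features.

```typescript
// Remove "[ref=eN]" markers so only roles and names are compared.
function stripRefs(snapshot: string): string[] {
  return snapshot
    .split('\n')
    .map((line) => line.replace(/\s*\[ref=e\d+\]/g, ''));
}

// True when two snapshots are identical once refs are ignored.
function sameStructure(before: string, after: string): boolean {
  const a = stripRefs(before);
  const b = stripRefs(after);
  return a.length === b.length && a.every((line, i) => line === b[i]);
}

const beforeSnapshot = '- button "Submit" [ref=e2]\n- textbox "Email" [ref=e3]';
const afterSnapshot = '- button "Submit" [ref=e4]\n- textbox "Email" [ref=e5]';

// true: the two snapshots differ only in their ref numbers.
const refsOnlyChanged = sameStructure(beforeSnapshot, afterSnapshot);
```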

Screenshot Diff

Dimension mismatch - Images must have identical dimensions
Dynamic content - Timestamps, ads, and animations cause false positives
Antialiasing differences - Font rendering can vary across environments
Performance - Pixel comparison can be slow for large images
