Multi-Page Browser Automation with the Chrome Extension

The Page Agent Chrome extension is an optional companion to the page-agent JavaScript library. PageAgent.js handles in-page automation on its own; the extension adds three capabilities: tasks that span multiple tabs, browser-level navigation control, and a window.PAGE_AGENT_EXT API that lets page JavaScript or external agent systems (such as local MCP servers or cloud agents) trigger automation.

Key Features

Multi-Page Tasks

Run tasks across multiple pages and tabs without being limited to a single page context.

Browser-Level Control

Enable richer automation including cross-tab navigation, page switching, and tab management.

Open Integration API

With explicit user authorization, page JS, local agents, or cloud agents can trigger multi-page tasks through the extension.

Installation

Install the extension

Install from the Chrome Web Store (stable) or GitHub Releases (faster updates):

Chrome Web Store: Page Agent Ext
GitHub Releases: alibaba/page-agent/releases

Install type definitions (recommended)

Add @page-agent/core to your project for full TypeScript support:

npm install @page-agent/core --save-dev

Set the auth token

Open the extension side panel, copy your auth token, then set it in your page’s localStorage. The extension will only expose window.PAGE_AGENT_EXT to pages that present a matching token.

// Set this in your trusted application only
localStorage.setItem('PageAgentExtUserAuthToken', 'your-token-from-extension')

Never share the auth token with untrusted pages or third-party scripts. The extension has broad browser permissions, and token-based access ensures that only applications you explicitly trust can trigger automation.
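One way to follow this advice is to gate the localStorage write behind an origin allow-list. The sketch below is only illustrative: `TRUSTED_ORIGINS`, `isTrustedOrigin`, `installAuthToken`, and the example origin are hypothetical names introduced here and are not part of the extension. Only the storage key comes from the step above.

```typescript
// Hypothetical allow-list; replace with the origins of your own application.
const TRUSTED_ORIGINS: readonly string[] = ['https://app.example.com']

// Storage key taken from the installation step above.
const TOKEN_KEY = 'PageAgentExtUserAuthToken'

// Minimal subset of the Web Storage API, so the helper can run outside a browser.
interface TokenStore {
  setItem(key: string, value: string): void
}

// True only when the origin exactly matches an allow-list entry.
function isTrustedOrigin(
  origin: string,
  trusted: readonly string[] = TRUSTED_ORIGINS,
): boolean {
  return trusted.indexOf(origin) !== -1
}

// Writes the token only for trusted origins and reports whether it was written.
function installAuthToken(origin: string, token: string, store: TokenStore): boolean {
  if (!isTrustedOrigin(origin)) return false
  store.setItem(TOKEN_KEY, token)
  return true
}

// In a browser (assumed usage):
// installAuthToken(location.origin, 'your-token-from-extension', localStorage)
```

Passing the store as a parameter keeps the check easy to unit-test. In a page you would pass `localStorage` directly.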

Quick Start

After setting the auth token, wait for the extension to inject window.PAGE_AGENT_EXT, then call execute:

import type { AgentActivity, AgentStatus, ExecutionResult, HistoricalEvent } from '@page-agent/core'

// Wait for extension injection (up to 1 second)
async function waitForExtension(timeout = 1000): Promise<boolean> {
  const start = Date.now()
  while (Date.now() - start < timeout) {
    if (window.PAGE_AGENT_EXT) return true
    await new Promise((r) => setTimeout(r, 100))
  }
  return false
}

if (await waitForExtension()) {
  const result = await window.PAGE_AGENT_EXT!.execute(
    'Search for "page-agent" on GitHub and open the first result',
    {
      baseURL: 'https://api.openai.com/v1',
      apiKey: 'your-api-key',
      model: 'gpt-5.2',
      onStatusChange: (status) => console.log('Status:', status),
      onActivity: (activity) => console.log('Activity:', activity),
    }
  )
  console.log('Result:', result)
}

API Reference

`PAGE_AGENT_EXT.execute(task, config)`

Executes a natural-language browser task. Returns a Promise<ExecutionResult> that resolves when the task completes (or fails).

Parameter	Type	Required	Description
`task`	`string`	Yes	Natural-language description of the task to perform.
`config`	`ExecuteConfig`	Yes	LLM settings, scope options, and event callbacks. See the table below.

ExecuteConfig properties:

Property	Type	Required	Description
`baseURL`	`string`	Yes	LLM API endpoint URL
`model`	`string`	Yes	Model name
`apiKey`	`string`	No	LLM API key
`systemInstruction`	`string`	No	Global system-level instructions (equivalent to `AgentConfig.instructions.system`)
`includeInitialTab`	`boolean`	No	Whether to include the tab where `execute` was called. Default: `true`
`experimentalIncludeAllTabs`	`boolean`	No	Control all unpinned tabs in the window instead of only the tab group. Default: `false`
`onStatusChange`	`(status: AgentStatus) => void`	No	Called when agent lifecycle status changes
`onActivity`	`(activity: AgentActivity) => void`	No	Called for real-time activity updates (thinking, executing, etc.)
`onHistoryUpdate`	`(history: HistoricalEvent[]) => void`	No	Called after each step with the full event history
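To illustrate how the optional properties combine, here is a minimal sketch that excludes the calling tab and tracks the history length. The helper `runInBackgroundTabs` and the `PageAgentExt` interface are hypothetical names introduced here. The `unknown` aliases are local stand-ins for the types exported by @page-agent/core, used only so the sketch compiles on its own. The instruction text is an arbitrary example.

```typescript
// Local stand-ins for the types exported by @page-agent/core (assumption:
// in a real project you would import the real types instead).
type AgentStatus = unknown
type AgentActivity = unknown
type HistoricalEvent = unknown
type ExecutionResult = unknown

interface ExecuteConfig {
  baseURL: string
  model: string
  apiKey?: string
  systemInstruction?: string
  includeInitialTab?: boolean
  experimentalIncludeAllTabs?: boolean
  onStatusChange?: (status: AgentStatus) => void
  onActivity?: (activity: AgentActivity) => void
  onHistoryUpdate?: (history: HistoricalEvent[]) => void
}

// Only the part of the extension API this sketch needs.
interface PageAgentExt {
  execute(task: string, config: ExecuteConfig): Promise<ExecutionResult>
}

// Hypothetical helper: runs a task without the calling tab and reports how
// many history events the last onHistoryUpdate call contained.
async function runInBackgroundTabs(
  ext: PageAgentExt,
  task: string,
  llm: { baseURL: string; model: string; apiKey?: string },
): Promise<{ result: ExecutionResult; historyLength: number }> {
  let historyLength = 0
  const result = await ext.execute(task, {
    ...llm,
    systemInstruction: 'Prefer official documentation pages.',
    includeInitialTab: false, // exclude the tab where execute was called
    onHistoryUpdate: (history) => {
      historyLength = history.length
    },
  })
  return { result, historyLength }
}

// In a browser (assumed usage):
// await runInBackgroundTabs(window.PAGE_AGENT_EXT!, 'Compare two docs pages', {
//   baseURL: 'https://api.openai.com/v1',
//   model: 'gpt-5.2',
//   apiKey: 'your-api-key',
// })
```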

`PAGE_AGENT_EXT.stop()`

Sends a cancellation signal to the currently running task. The task will stop at the next cooperative cancellation point.

// Stop current task execution
window.PAGE_AGENT_EXT!.stop()
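A common use of stop() is to enforce a time budget. The following sketch assumes only the two documented methods. `executeWithDeadline` and `StoppableAgent` are hypothetical names. Because stop() only signals cancellation, the helper keeps awaiting the promise returned by execute. Whether that promise resolves or rejects after a stop is not specified here, so a rejection simply propagates to the caller.

```typescript
// Minimal shape of the extension API used by this sketch.
interface StoppableAgent<TConfig, TResult> {
  execute(task: string, config: TConfig): Promise<TResult>
  stop(): void
}

// Hypothetical helper: starts a task and requests cancellation if it is still
// running after `timeoutMs`. The task itself ends at its next cancellation
// point, so the execute promise is still awaited.
async function executeWithDeadline<TConfig, TResult>(
  agent: StoppableAgent<TConfig, TResult>,
  task: string,
  config: TConfig,
  timeoutMs: number,
): Promise<{ result: TResult; stopRequested: boolean }> {
  let stopRequested = false
  const timer = setTimeout(() => {
    stopRequested = true
    agent.stop()
  }, timeoutMs)
  try {
    const result = await agent.execute(task, config)
    return { result, stopRequested }
  } finally {
    // Always clear the timer so stop() is not sent after the task has ended.
    clearTimeout(timer)
  }
}

// In a browser (assumed usage):
// await executeWithDeadline(window.PAGE_AGENT_EXT!, task, config, 60_000)
```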

`PAGE_AGENT_EXT_VERSION`

A version string injected alongside the main API object. Use it to check extension capabilities before calling the API:

if (window.PAGE_AGENT_EXT_VERSION) {
  console.log('Extension version:', window.PAGE_AGENT_EXT_VERSION)
}
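To gate a feature on a minimum version, the version string has to be compared numerically rather than as text. The sketch below assumes the extension reports a plain dot-separated numeric version. That format, the helper names `compareVersions` and `supportsVersion`, and the minimum `'1.2.0'` in the usage comment are all assumptions made for illustration.

```typescript
// Compares two dot-separated numeric versions such as "1.2.0".
// Assumption: no pre-release suffixes; non-numeric segments count as 0.
// Returns -1, 0, or 1.
function compareVersions(a: string, b: string): number {
  const pa = a.split('.').map((s) => parseInt(s, 10) || 0)
  const pb = b.split('.').map((s) => parseInt(s, 10) || 0)
  const len = Math.max(pa.length, pb.length)
  for (let i = 0; i < len; i++) {
    // Missing segments are treated as 0, so "1.2" equals "1.2.0".
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0)
    if (diff !== 0) return diff < 0 ? -1 : 1
  }
  return 0
}

// True when a version is present and at least `minimum`.
function supportsVersion(version: string | undefined, minimum: string): boolean {
  return version !== undefined && compareVersions(version, minimum) >= 0
}

// In a browser (assumed usage, with a hypothetical minimum version):
// if (supportsVersion(window.PAGE_AGENT_EXT_VERSION, '1.2.0')) { /* call the API */ }
```

Comparing segment by segment avoids the string-ordering trap where "1.10.0" sorts before "1.9.0".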

Window Type Declaration

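To declare the window API yourself, add the following to your project. The declaration still imports its event and result types from @page-agent/core; if you do not install that package, replace the import with your own definitions of those four types: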

import type { AgentActivity, AgentStatus, ExecutionResult, HistoricalEvent } from '@page-agent/core'

interface ExecuteConfig {
  baseURL: string
  model: string
  apiKey?: string
  systemInstruction?: string
  includeInitialTab?: boolean
  experimentalIncludeAllTabs?: boolean
  onStatusChange?: (status: AgentStatus) => void
  onActivity?: (activity: AgentActivity) => void
  onHistoryUpdate?: (history: HistoricalEvent[]) => void
}

type Execute = (task: string, config: ExecuteConfig) => Promise<ExecutionResult>

declare global {
  interface Window {
    PAGE_AGENT_EXT_VERSION?: string
    PAGE_AGENT_EXT?: {
      version: string
      execute: Execute
      stop: () => void
    }
  }
}

Limitations

Normal browser windows only. The extension relies on the Chrome tab group API, which does not work in pop-up windows or PWA app windows. Run your tasks from a standard browser window.
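A page can warn users before they start a task from an installed-app window. The sketch below is a heuristic built on the standard `display-mode` media feature and is not something the extension provides. `looksLikeAppWindow` and `APP_DISPLAY_MODES` are hypothetical names, and the check does not detect pop-up windows.

```typescript
// Subset of window.matchMedia used here, injected so the check can run in tests.
type MatchMedia = (query: string) => { matches: boolean }

// Display modes that installed web apps commonly run in.
const APP_DISPLAY_MODES: readonly string[] = [
  'standalone',
  'minimal-ui',
  'window-controls-overlay',
]

// Heuristic: true when the page appears to run in an installed-app (PWA) window.
function looksLikeAppWindow(matchMedia: MatchMedia): boolean {
  return APP_DISPLAY_MODES.some(
    (mode) => matchMedia('(display-mode: ' + mode + ')').matches,
  )
}

// In a browser (assumed usage):
// if (looksLikeAppWindow((q) => window.matchMedia(q))) {
//   console.warn('Open this page in a normal browser window to run tasks.')
// }
```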
