All public types are exported from the page-agent package (re-exported from @page-agent/core). Import them directly for full TypeScript type safety in your application code or build tooling integrations.
import type {
  ExecutionResult,
  AgentStatus,
  AgentActivity,
  HistoricalEvent,
  AgentStepEvent,
  AgentReflection,
  PageAgentTool,
  SupportedLanguage,
} from 'page-agent'

AgentStatus

Represents the lifecycle state of an agent instance. Transitions flow from idle → running → completed | error | stopped.
type AgentStatus = 'idle' | 'running' | 'completed' | 'error' | 'stopped'
idle
string
The agent has been constructed but has not yet started a task, or the previous task has been fully torn down.
running
string
A task is currently in progress. The agent is executing steps.
completed
string
The most recent task finished successfully — the LLM called the done tool with success: true.
error
string
The task terminated due to an unrecoverable error (LLM error, step limit exceeded, etc.).
stopped
string
The task was cancelled by a user or programmatic call to agent.stop().
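
Because AgentStatus is a closed string union, a switch over it can be checked for exhaustiveness by the compiler. The following is a minimal sketch of that idea. isTerminal is a hypothetical helper, not part of the page-agent API, and the type is redeclared locally only so the snippet stands alone.

```typescript
// Redeclared from the definition above so the snippet is self-contained.
type AgentStatus = 'idle' | 'running' | 'completed' | 'error' | 'stopped'

// Hypothetical helper: true once a task has reached a final state.
function isTerminal(status: AgentStatus): boolean {
  switch (status) {
    case 'completed':
    case 'error':
    case 'stopped':
      return true
    case 'idle':
    case 'running':
      return false
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new
      // member is added to AgentStatus without a matching case.
      const unreachable: never = status
      throw new Error(`Unknown status: ${String(unreachable)}`)
    }
  }
}
```

A UI could use a check like this to decide when to re-enable a "Run" button, for example.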

AgentActivity

Ephemeral, real-time state describing what the agent is doing right now. Unlike HistoricalEvent, activities are not persisted. Absence of an activity event means the agent is idle. Listen to the activity event on the agent instance to update live UI.
type AgentActivity =
  | { type: 'thinking' }
  | { type: 'executing'; tool: string; input: unknown }
  | { type: 'executed'; tool: string; input: unknown; output: string; duration: number }
  | { type: 'retrying'; attempt: number; maxAttempts: number }
  | { type: 'error'; message: string }
thinking
object
The LLM is generating its next action.
executing
object
A tool call has been dispatched and is awaiting its result.
executed
object
A tool call completed successfully.
retrying
object
The LLM call failed and a retry is about to be attempted.
error
object
A terminal error has occurred.
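
Since every AgentActivity variant carries a type discriminant, a live UI can narrow on it to build a status label. The sketch below assumes nothing beyond the union shown above. describeActivity is a hypothetical helper, and the unit of duration is not stated here, so the value is printed as-is.

```typescript
// Redeclared from the definition above so the snippet is self-contained.
type AgentActivity =
  | { type: 'thinking' }
  | { type: 'executing'; tool: string; input: unknown }
  | { type: 'executed'; tool: string; input: unknown; output: string; duration: number }
  | { type: 'retrying'; attempt: number; maxAttempts: number }
  | { type: 'error'; message: string }

// Hypothetical helper: turns an activity into a short label for a status bar.
function describeActivity(activity: AgentActivity): string {
  switch (activity.type) {
    case 'thinking':
      return 'Thinking...'
    case 'executing':
      return `Running ${activity.tool}`
    case 'executed':
      // The duration unit is not documented here, so no unit is appended.
      return `Finished ${activity.tool} (duration: ${activity.duration})`
    case 'retrying':
      return `Retrying (${activity.attempt}/${activity.maxAttempts})`
    case 'error':
      return `Error: ${activity.message}`
  }
}
```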

ExecutionResult

Returned by agent.execute(task) and by the Extension API’s window.PAGE_AGENT_EXT.execute().
interface ExecutionResult {
  success: boolean
  data: string
  history: HistoricalEvent[]
}
success
boolean
true if the agent called done with success: true; false on error or user stop.
data
string
The agent’s final summary text — the value passed to the done tool’s text parameter.
history
HistoricalEvent[]
Complete ordered list of all events that occurred during the task. See HistoricalEvent below.
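
As a rough illustration of consuming the result, the sketch below turns an ExecutionResult into a one-line summary. summarizeResult is a hypothetical helper, and history is typed as unknown[] here purely as a stand-in for HistoricalEvent[] so the snippet stands alone.

```typescript
// Local copy of the interface above; `unknown[]` stands in for HistoricalEvent[].
interface ExecutionResult {
  success: boolean
  data: string
  history: unknown[]
}

// Hypothetical helper: one-line summary suitable for a log or toast.
function summarizeResult(result: ExecutionResult): string {
  const outcome = result.success ? 'succeeded' : 'did not succeed'
  return `Task ${outcome} after ${result.history.length} event(s): ${result.data}`
}
```

Note that success: false covers both errors and user stops, so callers that need to tell them apart would have to look at the agent status or the history as well.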

HistoricalEvent

A persisted record of something that occurred during a task. Stored in ExecutionResult.history and streamed via onAfterStep and onHistoryUpdate.
type HistoricalEvent =
  | AgentStepEvent
  | ObservationEvent
  | UserTakeoverEvent
  | RetryEvent
  | AgentErrorEvent
AgentStepEvent
object
A completed LLM reasoning + tool-execution cycle. See AgentStepEvent for the full structure.
observation
object
A persistent observation injected into the agent’s memory. Stays in context for the duration of the task.
user_takeover
object
Marks a point at which the user briefly took manual control of the browser.
retry
object
Records that an LLM call was retried.
error
object
A fatal error event that ended the task.
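
A history array can be narrowed to just its steps with a type guard. The sketch below is deliberately simplified: only AgentStepEvent's shape is documented on this page, so the other variants are modelled as bare objects with a type field, and the assumption that their type values match the names listed above (observation, user_takeover, retry, error) is exactly that, an assumption. isStep and countByType are hypothetical helpers.

```typescript
// Simplified stand-ins for the real event types (assumed shapes, see lead-in).
type StepLike = { type: 'step'; stepIndex: number }
type OtherLike = { type: 'observation' | 'user_takeover' | 'retry' | 'error' }
type EventLike = StepLike | OtherLike

// Type guard: narrows a history entry to a step event.
function isStep(event: EventLike): event is StepLike {
  return event.type === 'step'
}

// Tallies how many events of each type occurred during a task.
function countByType(history: EventLike[]): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const event of history) {
    counts[event.type] = (counts[event.type] ?? 0) + 1
  }
  return counts
}
```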

AgentStepEvent

One complete step: the LLM reflected on the previous action and chose a new one.
interface AgentStepEvent {
  type: 'step'
  stepIndex: number
  reflection: Partial<AgentReflection>
  action: {
    name: string
    input: any
    output: string
  }
  usage: {
    promptTokens: number
    completionTokens: number
    totalTokens: number
    cachedTokens?: number
    reasoningTokens?: number
  }
  rawResponse?: unknown
  rawRequest?: unknown
}
type
'step'
Discriminant literal — always 'step'.
stepIndex
number
Zero-indexed position of this step within the task.
reflection
Partial<AgentReflection>
The LLM’s self-reflection before acting. See AgentReflection.
action
object
The tool call made during this step.
usage
object
Token consumption for this step’s LLM call.
rawResponse
unknown
Unprocessed LLM response object. Useful for debugging provider-specific fields.
rawRequest
unknown
Unprocessed LLM request object. Useful for inspecting exactly what was sent.
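
The per-step usage object makes it straightforward to total token consumption for a whole task. The following is a minimal sketch; sumUsage is a hypothetical helper, and treating a missing cachedTokens or reasoningTokens as zero is a choice made for this example rather than documented behaviour.

```typescript
// Local copy of the `usage` shape from AgentStepEvent above.
interface StepUsage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
  cachedTokens?: number
  reasoningTokens?: number
}

// Hypothetical helper: adds up usage across all steps of a task.
function sumUsage(steps: Array<{ usage: StepUsage }>): Required<StepUsage> {
  const total: Required<StepUsage> = {
    promptTokens: 0,
    completionTokens: 0,
    totalTokens: 0,
    cachedTokens: 0,
    reasoningTokens: 0,
  }
  for (const { usage } of steps) {
    total.promptTokens += usage.promptTokens
    total.completionTokens += usage.completionTokens
    total.totalTokens += usage.totalTokens
    // Optional counters are treated as 0 when the provider omits them.
    total.cachedTokens += usage.cachedTokens ?? 0
    total.reasoningTokens += usage.reasoningTokens ?? 0
  }
  return total
}
```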

AgentReflection

The structured reasoning state the LLM must produce before every tool call. Enforces a reflection-before-action mental model.
interface AgentReflection {
  evaluation_previous_goal: string
  memory: string
  next_goal: string
}
evaluation_previous_goal
string
The LLM’s assessment of whether the previous action achieved its intended goal.
memory
string
Key information the LLM wants to carry forward into subsequent steps.
next_goal
string
A concise statement of what the LLM intends to accomplish in the upcoming action.
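
AgentStepEvent exposes the reflection as Partial<AgentReflection>, so any of the three fields may be missing. The sketch below shows one way to render whatever is present. formatReflection and its labels are hypothetical and chosen only for illustration.

```typescript
// Local copy of the interface above.
interface AgentReflection {
  evaluation_previous_goal: string
  memory: string
  next_goal: string
}

// Hypothetical helper: renders only the fields the LLM actually produced.
function formatReflection(reflection: Partial<AgentReflection>): string {
  const lines: string[] = []
  if (reflection.evaluation_previous_goal) {
    lines.push(`Evaluation: ${reflection.evaluation_previous_goal}`)
  }
  if (reflection.memory) {
    lines.push(`Memory: ${reflection.memory}`)
  }
  if (reflection.next_goal) {
    lines.push(`Next goal: ${reflection.next_goal}`)
  }
  return lines.join('\n')
}
```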

PageAgentTool

The interface for both built-in and custom tools. The execute function runs with the PageAgentCore instance as this, giving it access to this.pageController and other agent internals.
interface PageAgentTool<TParams = any> {
  description: string
  inputSchema: z.ZodType<TParams>
  execute: (
    this: PageAgentCore,
    args: TParams,
    ctx: ToolContext
  ) => Promise<string>
}

interface ToolContext {
  signal: AbortSignal
}
description
string
Natural language description sent to the LLM to help it decide when to call this tool.
inputSchema
z.ZodType
Zod schema that defines and validates the tool’s input parameters. Converted to a JSON Schema for the LLM.
execute
function
Async function that performs the tool action. Must return a descriptive result string. Must honour ctx.signal: call ctx.signal.throwIfAborted() inside any loop, or pass the signal to fetch.
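
To illustrate the cancellation requirement in a loop (the fetch case needs only the signal option), here is a sketch of a polling routine that could serve as the body of an execute function. pollUntil, its parameters, and its result strings are hypothetical; only ToolContext and the standard AbortSignal.throwIfAborted() method are taken as given.

```typescript
// Local copy of the context shape above.
interface ToolContext {
  signal: AbortSignal
}

// Hypothetical polling routine: re-checks a condition until it holds,
// giving up after maxAttempts and aborting promptly when the task is stopped.
async function pollUntil(
  check: () => boolean,
  ctx: ToolContext,
  maxAttempts = 10,
  delayMs = 100,
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Throws the signal's abort reason if the signal has been aborted.
    ctx.signal.throwIfAborted()
    if (check()) {
      return `Condition met after ${attempt} attempt(s).`
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs))
  }
  return `Condition not met after ${maxAttempts} attempts.`
}
```

Checking the signal at the top of each iteration keeps the worst-case delay before cancellation to roughly one delayMs interval.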

The tool() helper

Use the tool() helper for full TypeScript inference on execute’s parameter types.
import { tool } from 'page-agent'
// or: import { tool } from '@page-agent/core'
import { z } from 'zod/v4'

const myTool = tool({
  description: 'Fetch JSON from a URL and return it as a string.',
  inputSchema: z.object({
    url: z.string().url(),
  }),
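  execute: async function (input, { signal }) {
    // Forward the abort signal so that stopping the task cancels the request.
    const res = await fetch(input.url, { signal })
    // Report HTTP failures as a descriptive result string.
    if (!res.ok) return `Request failed with HTTP status ${res.status}.`
    const data = await res.json()
    return JSON.stringify(data)
  },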
})

SupportedLanguage

type SupportedLanguage = 'en-US' | 'zh-CN'
Used by AgentConfig.language. Controls the language of agent-generated UI text and internal prompts.
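
When the language comes from user input or a stored preference, it may help to validate it against the union before passing it on. The following is a small sketch under that assumption. isSupportedLanguage and resolveLanguage are hypothetical helpers, and falling back to 'en-US' is an arbitrary choice for this example.

```typescript
// Redeclared from the definition above so the snippet is self-contained.
type SupportedLanguage = 'en-US' | 'zh-CN'

const SUPPORTED_LANGUAGES: readonly SupportedLanguage[] = ['en-US', 'zh-CN']

// Type guard: true only for exact matches of a supported language tag.
function isSupportedLanguage(value: string): value is SupportedLanguage {
  return (SUPPORTED_LANGUAGES as readonly string[]).includes(value)
}

// Hypothetical helper: returns the value if supported, otherwise the fallback.
function resolveLanguage(
  value: string,
  fallback: SupportedLanguage = 'en-US',
): SupportedLanguage {
  return isSupportedLanguage(value) ? value : fallback
}
```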
