AgentConfig is the primary configuration interface for both PageAgent and PageAgentCore. It extends LLMConfig and covers every behavioural, lifecycle, and capability option available when constructing an agent instance. Because all LLM connection fields are inherited directly, they are set on the same object and there is no need to pass a separate config object.

LLM Connection Fields

These fields are inherited from LLMConfig and control how the agent communicates with the LLM backend.
baseURL
string
required
Base URL of the OpenAI-compatible LLM API endpoint. Examples: https://api.openai.com/v1, http://localhost:11434/v1.
model
string
required
Model identifier as accepted by the provider, for example gpt-4.1-mini, qwen3.5-plus, or qwen3:14b.
apiKey
string
API key for the LLM provider. Omit for local runtimes (Ollama, LM Studio) or when authentication is handled by customFetch.
temperature
number
deprecated
Sampling temperature. Deprecated, because many models reject this parameter outright. If you need it, set it through transformRequestBody, and only for models you have verified accept it.
maxRetries
number
Number of times the agent will retry a failed LLM call before propagating an error. Defaults to 3.
disableNamedToolChoice
boolean
When true, removes the tool_choice field from every LLM request. Required for LM Studio and other local models that reject named tool choice with "Invalid tool_choice type: 'object'".
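As a sketch of how the connection fields above combine, the two values below describe a hosted endpoint and a local runtime. The ConnectionFields type is a local stand-in that mirrors only the fields documented in this section, and the helper name hostedConnection is illustrative rather than part of the library.

```typescript
// Local stand-in mirroring the documented LLM connection fields.
// In a real project the library's own types would be used instead.
type ConnectionFields = {
  baseURL: string
  model: string
  apiKey?: string
  maxRetries?: number
  disableNamedToolChoice?: boolean
}

// Hosted provider: the API key is supplied by the caller.
function hostedConnection(apiKey: string): ConnectionFields {
  return {
    baseURL: 'https://api.openai.com/v1',
    model: 'gpt-4.1-mini',
    apiKey,
  }
}

// Local runtime: apiKey is omitted. disableNamedToolChoice is only needed
// when the local server rejects named tool choice.
const localConnection: ConnectionFields = {
  baseURL: 'http://localhost:11434/v1',
  model: 'qwen3:14b',
  disableNamedToolChoice: true,
}
```

Either value can be spread into a larger AgentConfig object alongside the behaviour fields and hooks described later.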
transformRequestBody
(body: Record<string, unknown>) => Record<string, unknown> | undefined
Intercept and transform the final request body before it is sent to the provider. Return a new object, or mutate the input and return undefined. Typical use cases include adding prompt-caching hints or provider-specific flags.
transformRequestBody: (body) => ({
  ...body,
  cache_control: { type: 'ephemeral' },
})
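Since temperature is deprecated as a top-level field, one way to follow that advice is to set it inside transformRequestBody only for models on an allow-list. This is a minimal sketch: the allow-list contents, the helper name withTemperature, and the assumption that the outgoing body carries a model string are all illustrative.

```typescript
type RequestBody = Record<string, unknown>

// Hypothetical allow-list: models you have verified accept `temperature`.
const TEMPERATURE_SAFE_MODELS = new Set(['gpt-4.1-mini'])

// Builds a transformRequestBody-shaped function that adds `temperature`
// only when the outgoing body targets an allow-listed model.
function withTemperature(temperature: number) {
  return (body: RequestBody): RequestBody => {
    const model = body.model
    if (typeof model === 'string' && TEMPERATURE_SAFE_MODELS.has(model)) {
      // Return a new object rather than mutating the input.
      return { ...body, temperature }
    }
    // Any other model: hand the body back untouched.
    return body
  }
}
```

It would be passed as transformRequestBody: withTemperature(0.2), so requests for unlisted models never carry the parameter.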
customFetch
typeof globalThis.fetch
Custom fetch implementation for all LLM API requests. Use this to inject authentication headers, route traffic through a backend proxy, or override CORS behaviour in browser environments.
customFetch: (url, init) =>
  fetch(url, { ...init, credentials: 'include' })
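For the header-injection use case, a wrapper along the following lines may be a useful starting point. It is a sketch under assumptions: makeAuthFetch and getToken are hypothetical names, the bearer-token scheme is only an example, and request headers are assumed to arrive through the init argument.

```typescript
type FetchLike = typeof globalThis.fetch

// Wraps a fetch implementation so every request carries a bearer token.
// `getToken` is a hypothetical callback supplied by your application.
function makeAuthFetch(
  getToken: () => string,
  baseFetch: FetchLike = globalThis.fetch
): FetchLike {
  return (input, init) => {
    // Copy any caller-supplied headers so they are preserved.
    const headers = new Headers(init?.headers)
    headers.set('Authorization', `Bearer ${getToken()}`)
    // Forward all other init options (method, body, signal) unchanged.
    return baseFetch(input, { ...init, headers })
  }
}
```

The result can be assigned to customFetch. Taking baseFetch as a parameter keeps the wrapper testable without network access.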

Agent Behaviour Fields

language
'en-US' | 'zh-CN'
default: 'en-US'
UI language for agent-generated messages and prompts. Currently supports English (en-US) and Simplified Chinese (zh-CN).
maxSteps
number
default: 40
Maximum number of tool-call steps the agent may execute for a single task. The agent stops and returns an error result if this limit is reached before the task is marked done.
stepDelay
number
default: 0.4
Minimum pause between consecutive steps, in seconds. Increase this value to slow the agent down on pages that animate or load data asynchronously.
customTools
Record<string, PageAgentTool | null>
Extend or override the agent’s built-in tool set. Keys are tool names; values are PageAgentTool objects or null to remove a built-in tool entirely.
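The null convention for removing tools can be sketched with a small helper that turns a list of tool names into a customTools-shaped record. The helper name removeTools and both tool names are hypothetical. This reference does not list the built-in tool names, so substitute the ones your version actually provides.

```typescript
// Builds a customTools-shaped record that marks each named tool for removal.
function removeTools(names: readonly string[]): Record<string, null> {
  const removals: Record<string, null> = {}
  for (const name of names) {
    removals[name] = null // null means "remove this built-in tool"
  }
  return removals
}

// Hypothetical tool names, used for illustration only.
const customTools = removeTools(['example_tool_a', 'example_tool_b'])
```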
instructions
object
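Instructions that guide agent behaviour across all tasks or on specific pages. The object has two optional members: system, a string that applies to every task, and getPageInstructions, a function that receives the current page URL and returns instructions for that page, or undefined or null when there are none.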
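A minimal sketch of an instructions object follows, assuming the shape shown in the full interface at the end of this page (an optional system string and an optional getPageInstructions function). The Instructions type is a local stand-in, and the path prefixes and instruction text are invented for illustration.

```typescript
// Local stand-in for the documented `instructions` shape.
type Instructions = {
  system?: string
  getPageInstructions?: (url: string) => string | undefined | null
}

// Hypothetical per-page guidance, keyed by URL pathname prefix.
const PAGE_HINTS: Array<[prefix: string, hint: string]> = [
  ['/checkout', 'Never submit payment without explicit user confirmation.'],
  ['/settings', 'Do not change notification preferences.'],
]

const instructions: Instructions = {
  system: 'Prefer visible buttons over keyboard shortcuts.',
  getPageInstructions: (url) => {
    // Assumes `url` is an absolute URL; new URL() throws otherwise.
    const { pathname } = new URL(url)
    const match = PAGE_HINTS.find(([prefix]) => pathname.startsWith(prefix))
    return match ? match[1] : undefined
  },
}
```

Keeping the page rules in a table makes it easy to review which pages carry extra guidance.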
transformPageContent
(content: string) => string | Promise<string>
Transform the simplified page content after DOM extraction but before it is sent to the LLM. Use this to inspect extraction output, inject additional context, or mask sensitive data.
// Mask Chinese phone numbers before sending to LLM
transformPageContent: async (content) => {
  return content.replace(/1[3-9]\d{9}/g, '***********')
}
customSystemPrompt
string
Completely replaces the default system prompt. Use with caution: an incorrect prompt can break agent reasoning and tool-call behaviour.
experimentalScriptExecutionTool
boolean
default: false
Enables the execute_javascript tool, which lets the agent run LLM-generated JavaScript directly on the page. Such scripts can cause unpredictable side effects and may bypass data masking applied by transformPageContent.
experimentalLlmsTxt
boolean
default: false
When true, the agent fetches /llms.txt from the current site origin and includes its contents as additional context. The file is fetched at most once per origin per task.

Lifecycle Hooks

All lifecycle hooks are experimental. Their signatures may change in future minor versions without a deprecation period.
onBeforeTask
(agent: PageAgentCore) => void | Promise<void>
Called once immediately before task execution begins. Receives the PageAgentCore instance. Useful for resetting UI state or injecting observations before the first step.
onAfterTask
(agent: PageAgentCore, result: ExecutionResult) => void | Promise<void>
Called once after task execution completes, regardless of success or failure. Receives the final ExecutionResult.
onBeforeStep
(agent: PageAgentCore, stepCount: number) => void | Promise<void>
Called before each individual step. stepCount is zero-indexed. Use this to inject per-step observations or enforce custom pause logic.
onAfterStep
(agent: PageAgentCore, history: HistoricalEvent[]) => void | Promise<void>
Called after each step completes. Receives the full history array up to and including the step that just finished. Suitable for streaming progress to external systems.
onDispose
(agent: PageAgentCore, reason?: string) => void
Called when the agent instance is disposed. Receives an optional reason string.
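To illustrate how onBeforeStep and onAfterStep pair up, the sketch below records how long each step takes. It is self-contained, so the agent parameter is typed as unknown instead of PageAgentCore and the history entries as unknown instead of HistoricalEvent. createStepTimer is a hypothetical helper, not part of the library.

```typescript
// Stand-in for PageAgentCore so the sketch compiles on its own.
type Agent = unknown

// Creates a matching pair of step hooks plus the array they write to.
// `now` is injectable so the timer can be driven by a fake clock.
function createStepTimer(now: () => number = Date.now) {
  const durations: number[] = []
  let startedAt = 0
  return {
    durations,
    // onBeforeStep shape: remember when the step started.
    onBeforeStep: (_agent: Agent, _stepCount: number): void => {
      startedAt = now()
    },
    // onAfterStep shape: record the elapsed time for the finished step.
    onAfterStep: (_agent: Agent, _history: unknown[]): void => {
      durations.push(now() - startedAt)
    },
  }
}
```

The two hook functions can be passed as onBeforeStep and onAfterStep in the config. After a task, durations holds one entry per completed step, in milliseconds when Date.now is used.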

Full TypeScript Interface

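import type { LLMConfig } from '@page-agent/llms'
// ExecutionResult and HistoricalEvent are referenced below; they are assumed
// to be exported from '@page-agent/core' alongside PageAgentCore.
import type {
  ExecutionResult,
  HistoricalEvent,
  PageAgentCore,
  PageAgentTool,
} from '@page-agent/core'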

export interface AgentConfig extends LLMConfig {
  // --- Inherited from LLMConfig ---
  baseURL: string
  model: string
  apiKey?: string
  /** @deprecated Use transformRequestBody instead */
  temperature?: number
  maxRetries?: number
  disableNamedToolChoice?: boolean
  transformRequestBody?: (
    body: Record<string, unknown>
  ) => Record<string, unknown> | undefined
  customFetch?: typeof globalThis.fetch

  // --- Agent behaviour ---
  language?: 'en-US' | 'zh-CN'
  maxSteps?: number
  stepDelay?: number
  customTools?: Record<string, PageAgentTool | null>
  instructions?: {
    system?: string
    getPageInstructions?: (url: string) => string | undefined | null
  }
  transformPageContent?: (content: string) => Promise<string> | string
  customSystemPrompt?: string
  experimentalScriptExecutionTool?: boolean
  experimentalLlmsTxt?: boolean

  // --- Lifecycle hooks (experimental) ---
  onBeforeTask?: (agent: PageAgentCore) => Promise<void> | void
  onAfterTask?: (
    agent: PageAgentCore,
    result: ExecutionResult
  ) => Promise<void> | void
  onBeforeStep?: (
    agent: PageAgentCore,
    stepCount: number
  ) => Promise<void> | void
  onAfterStep?: (
    agent: PageAgentCore,
    history: HistoricalEvent[]
  ) => Promise<void> | void
  onDispose?: (agent: PageAgentCore, reason?: string) => void
}