AgentConfig is the primary configuration interface for both PageAgent and PageAgentCore. It extends LLMConfig and covers every behavioral, lifecycle, and capability option available when constructing an agent instance. All LLM connection fields are inherited directly — there is no need to pass a separate config object.
LLM Connection Fields
These fields are inherited from LLMConfig and control how the agent communicates with the LLM backend.
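As one illustration of the fields in this section, the sketch below shows what a transformRequestBody function might look like when it sets a sampling temperature only for models you have verified. The RequestBody type, the allow-list, and the value 0.2 are assumptions made for this example, not part of the library; the model name is a placeholder taken from the examples in this section, not a claim about that model.

```typescript
// Hypothetical shape of an OpenAI-compatible request body.
// Only the fields this sketch touches are listed; real bodies carry more.
type RequestBody = {
  model: string;
  temperature?: number;
  [key: string]: unknown;
};

// Placeholder allow-list: models you have verified accept `temperature`.
const TEMPERATURE_SAFE_MODELS = new Set<string>(["qwen3:14b"]);

// Returns a new object in both branches instead of mutating the input.
function transformRequestBody(body: RequestBody): RequestBody {
  if (!TEMPERATURE_SAFE_MODELS.has(body.model)) {
    // Unverified model: pass the body through unchanged (as a copy).
    return { ...body };
  }
  // Verified model: add the sampling temperature.
  return { ...body, temperature: 0.2 };
}
```

A function like this would be supplied as the transformRequestBody option. The exact parameter type the library passes in is not shown on this page, so check it against the package's type definitions before relying on this shape.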
Base URL of the OpenAI-compatible LLM API endpoint. Examples: https://api.openai.com/v1, http://localhost:11434/v1.
Model identifier as accepted by the provider, for example gpt-4.1-mini, qwen3.5-plus, or qwen3:14b.
API key for the LLM provider. Omit for local runtimes (Ollama, LM Studio) or when authentication is handled by customFetch.
Sampling temperature. Deprecated: many models reject this parameter outright. Use transformRequestBody to set it only for models you have verified accept it.
Number of times the agent will retry a failed LLM call before propagating an error. Defaults to 3.
When true, removes the tool_choice field from every LLM request. Required for LM Studio and other local models that reject named tool choice with "Invalid tool_choice type: 'object'".
Intercept and transform the final request body before it is sent to the provider. Return a new object, or mutate the input and return undefined. Typical use cases include adding prompt-caching hints or provider-specific flags.
Custom fetch implementation for all LLM API requests. Use this to inject authentication headers, route traffic through a backend proxy, or override CORS behaviour in browser environments.
Agent Behaviour Fields
UI language for agent-generated messages and prompts. Currently supports English (en-US) and Simplified Chinese (zh-CN).
Maximum number of tool-call steps the agent may execute for a single task. The agent stops and returns an error result if this limit is reached before the task is marked done.
Minimum pause between consecutive steps, in seconds. Increase this value to slow the agent down on pages that animate or load data asynchronously.
Extend or override the agent’s built-in tool set. Keys are tool names; values are PageAgentTool objects or null to remove a built-in tool entirely.
Instructions that guide agent behaviour across all tasks or on specific pages.
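The tool override map described above (tool names mapped to a tool object, or to null for removal) can be pictured with the following sketch. It only illustrates the documented semantics and is not the library's implementation: ToolLike is a hypothetical stand-in for PageAgentTool, and resolveTools, click, scroll, and lookup_order are names invented for this example.

```typescript
// Hypothetical minimal stand-in for PageAgentTool; the real interface is richer.
type ToolLike = { description: string };

// Override map: a tool object adds or replaces, null removes.
type ToolOverrides = Record<string, ToolLike | null>;

// Merges overrides into a copy of the built-in set without mutating either input.
function resolveTools(
  builtIn: Record<string, ToolLike>,
  overrides: ToolOverrides,
): Record<string, ToolLike> {
  const resolved: Record<string, ToolLike> = { ...builtIn };
  for (const [name, tool] of Object.entries(overrides)) {
    if (tool === null) {
      delete resolved[name]; // null removes a built-in tool entirely
    } else {
      resolved[name] = tool; // existing name overrides, new name extends
    }
  }
  return resolved;
}

// Example: drop a built-in tool and add a custom one (all names hypothetical).
const builtInTools = {
  click: { description: "Click an element" },
  scroll: { description: "Scroll the page" },
};
const activeTools = resolveTools(builtInTools, {
  scroll: null,
  lookup_order: { description: "Look up an order by id" },
});
```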
Transform the simplified page content after DOM extraction but before it is sent to the LLM. Use this to inspect extraction output, inject additional context, or mask sensitive data.
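A minimal sketch of the data-masking use case follows. It assumes the transform receives the simplified page content as a string and returns the transformed string; the actual callback signature is not shown on this page. The function name and both regular expressions are illustrative and deliberately simple, so treat them as a starting point and not as complete PII detection.

```typescript
// Assumed signature: simplified page content in, masked content out.
function maskSensitiveData(content: string): string {
  return content
    // Email addresses such as jane@example.com
    .replace(/[\w.+-]+@[\w-]+(\.[\w-]+)+/g, "[EMAIL]")
    // Runs of 13 to 16 digits, optionally separated by spaces or dashes
    .replace(/\b\d(?:[ -]?\d){12,15}\b/g, "[CARD]");
}

const masked = maskSensitiveData(
  "Contact jane@example.com, card 4111 1111 1111 1111.",
);
// masked === "Contact [EMAIL], card [CARD]."
```

Masking applied here covers only the text the agent sends to the LLM through this path.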
Completely replaces the default system prompt. Use with caution — an incorrect prompt can break agent reasoning and tool-call behaviour.
Enables the execute_javascript tool, which lets the agent run LLM-generated JavaScript directly on the page. Can cause unpredictable side effects and may bypass data-masking applied by transformPageContent.
When true, the agent fetches /llms.txt from the current site origin and includes its contents as additional context. Fetched at most once per origin per task.
Lifecycle Hooks
All lifecycle hooks are experimental. Their signatures may change in future minor versions without a deprecation period.
Called once immediately before task execution begins. Receives the PageAgentCore instance. Useful for resetting UI state or injecting observations before the first step.
Called once after task execution completes, regardless of success or failure. Receives the final ExecutionResult.
Called before each individual step. stepCount is zero-indexed. Use this to inject per-step observations or enforce custom pause logic.
Called after each step completes. Receives the full history array up to and including the step that just finished. Suitable for streaming progress to external systems.
Called when the agent instance is disposed. Receives an optional reason string.
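The per-step hooks described above can be supported by small helpers like the ones sketched here. The hook option names and the exact shape of history entries are not shown on this page, so StepRecord, describeProgress, and stepLabel are hypothetical names, and the helpers are written as standalone functions you would call from inside the relevant hook.

```typescript
// Hypothetical minimal history entry; real entries hold more detail.
type StepRecord = { action: string };

// For an after-step hook: the history array includes the step that just
// finished, so its length is the number of completed steps.
function describeProgress(history: StepRecord[], maxSteps: number): string {
  const latest = history[history.length - 1];
  if (!latest) {
    return "no steps yet";
  }
  return `step ${history.length}/${maxSteps}: ${latest.action}`;
}

// For a before-step hook: stepCount is zero-indexed, so add 1 for display.
function stepLabel(stepCount: number): string {
  return `Starting step ${stepCount + 1}`;
}
```

Because the hooks are experimental, keeping the formatting logic in plain functions like these limits the code you need to touch if a hook signature changes.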