Page Agent can be added to any webpage in a single script tag or a one-line npm install. This guide walks you through both integration paths, explains every configuration field, and shows you how to listen to agent events and safely handle API credentials in production.
Choose an integration path
Pick the approach that matches your project setup:
- CDN (Quick Try)
- npm / yarn / pnpm
The CDN bundle includes a pre-configured demo LLM, so you can evaluate Page Agent without an API key. Just drop one <script> tag into your HTML (for example index.html), with its src set to one of the mirror URLs. Two mirrors are available:
| Mirror | URL |
|---|---|
| Global (jsDelivr) | https://cdn.jsdelivr.net/npm/page-agent@1.11.0/dist/iife/page-agent.demo.js |
| China (npmmirror) | https://registry.npmmirror.com/page-agent/1.11.0/files/dist/iife/page-agent.demo.js |
By default the script auto-initialises a demo agent when it loads. Add ?autoInit=false to the URL to suppress auto-init; you can then create your own instance with new window.PageAgent(...).
Create an agent instance
Construct a PageAgent with your LLM credentials and preferred language. The agent attaches itself to the current page automatically.
Configuration fields
The model name as expected by your LLM provider, e.g. "qwen3.5-plus", "gpt-4o", "llama3.2". Must support structured tool/function calling.
Base URL of the OpenAI-compatible API endpoint. Examples:
- Alibaba DashScope: https://dashscope.aliyuncs.com/compatible-mode/v1
- OpenAI: https://api.openai.com/v1
- Local Ollama: http://localhost:11434/v1
API key for your LLM provider. Required for cloud providers; can be any non-empty string for local models (Ollama, LM Studio) that don’t check the key.
UI and system-prompt language. Controls the language the built-in panel and agent responses use.
Maximum number of ReAct loop iterations per task. Increase for very long multi-step workflows; lower to cap LLM spend.
Seconds to wait between steps. Increase this if pages need extra time to settle after an action.
Global system-level instructions injected into every LLM prompt. Use this to describe your application, restrict agent scope, or define domain terminology.
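The endpoint and API-key rules described above can be sketched as a small helper. This is an illustration only: `buildLlmSettings`, the `Provider` type, the `"local-placeholder"` string, and the `baseURL` field name are assumptions made for this example (only `apiKey` is named elsewhere in this guide), so check the actual option names before passing the result to PageAgent.

```typescript
// Hypothetical helper, not part of Page Agent.
type Provider = "dashscope" | "openai" | "ollama";

interface LlmSettings {
  model: string;
  baseURL: string; // assumed field name for the base URL option
  apiKey: string;
}

// Base URLs taken from the examples listed in this guide.
const BASE_URLS: Record<Provider, string> = {
  dashscope: "https://dashscope.aliyuncs.com/compatible-mode/v1",
  openai: "https://api.openai.com/v1",
  ollama: "http://localhost:11434/v1",
};

function buildLlmSettings(provider: Provider, model: string, apiKey?: string): LlmSettings {
  const isLocal = provider === "ollama";
  // Cloud providers need a real key; fail early instead of at request time.
  if (!isLocal && !apiKey) {
    throw new Error(`An API key is required for ${provider}`);
  }
  // Local servers ignore the key, but it must still be a non-empty string.
  return { model, baseURL: BASE_URLS[provider], apiKey: apiKey || "local-placeholder" };
}
```

For a local model, `buildLlmSettings("ollama", "llama3.2")` yields the localhost endpoint with a placeholder key, while calling it for a cloud provider without a key throws.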
Execute a task
Call agent.execute() with a natural-language task string. It returns a promise that resolves with an ExecutionResult when the task finishes, whether the agent calls done, hits maxSteps, or encounters an unrecoverable error. The promise only rejects for pre-flight failures (e.g. calling execute() while already running).
You can also show the built-in floating panel and let the user type instructions directly.
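Because execute() resolves even when a task fails and rejects only for pre-flight problems, callers usually need to handle three outcomes. The sketch below shows one way to do that. The `success` and `data` fields are assumptions about the ExecutionResult shape (this guide names the type but does not list its fields), and `AgentLike` and `runTask` are stand-ins invented for the example.

```typescript
// Assumed minimal shape of ExecutionResult; verify against the real type.
interface ResultLike {
  success: boolean;
  data: string;
}

// Stand-in for the part of PageAgent used here.
interface AgentLike {
  execute(task: string): Promise<ResultLike>;
}

type Outcome =
  | { kind: "completed"; data: string }
  | { kind: "failed"; data: string }
  | { kind: "rejected"; reason: string };

async function runTask(agent: AgentLike, task: string): Promise<Outcome> {
  try {
    // Resolves for done, maxSteps, and unrecoverable errors alike.
    const result = await agent.execute(task);
    return result.success
      ? { kind: "completed", data: result.data }
      : { kind: "failed", data: result.data };
  } catch (err) {
    // Only pre-flight failures (e.g. the agent is already running) land here.
    return { kind: "rejected", reason: err instanceof Error ? err.message : String(err) };
  }
}
```

Keeping "failed" and "rejected" separate lets the UI tell the user whether the agent tried and gave up, or never started.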
Listen to agent events
PageAgent extends EventTarget. Subscribe to events for real-time UI feedback and debugging:
| Event | When it fires | Payload |
|---|---|---|
| statuschange | Agent status transitions | agent.status |
| activity | Real-time step activity (transient) | AgentActivity on e.detail |
| historychange | History array mutated | agent.history |
| dispose | Agent is cleaned up | (none) |
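The events in the table can be consumed with the standard addEventListener API. The sketch below uses a stand-in class rather than the real PageAgent so that it runs on its own: `FakeAgent`, `DetailEvent`, `attachLogger`, the `"running"` status value, and the `{ type: "thinking" }` payload are all hypothetical. Only the event names and the use of e.detail come from this guide.

```typescript
// Event carrying a payload on `detail`, mirroring how `activity` is described.
class DetailEvent<T> extends Event {
  readonly detail: T;
  constructor(type: string, detail: T) {
    super(type);
    this.detail = detail;
  }
}

// Stand-in for PageAgent: only the EventTarget behaviour is modelled.
class FakeAgent extends EventTarget {
  status = "idle";
  setStatus(next: string): void {
    this.status = next;
    this.dispatchEvent(new Event("statuschange"));
  }
  emitActivity(detail: unknown): void {
    this.dispatchEvent(new DetailEvent("activity", detail));
  }
}

// Subscribes to two events and returns a function that unsubscribes again.
function attachLogger(agent: FakeAgent, log: string[]): () => void {
  const onStatus = (): void => {
    log.push(`status:${agent.status}`); // payload is read from the agent itself
  };
  const onActivity = (e: Event): void => {
    const detail = (e as DetailEvent<unknown>).detail; // payload is on e.detail
    log.push(`activity:${JSON.stringify(detail)}`);
  };
  agent.addEventListener("statuschange", onStatus);
  agent.addEventListener("activity", onActivity);
  return () => {
    agent.removeEventListener("statuschange", onStatus);
    agent.removeEventListener("activity", onActivity);
  };
}
```

Returning the unsubscribe function keeps listener cleanup next to the subscription, which helps when the agent outlives the component that is listening.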
Stop or dispose the agent
Use stop() to cancel a running task gracefully. Use dispose() when you are done with the agent instance entirely (e.g. on component unmount).
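A cleanup routine for a component unmount might combine the two calls as sketched below. The `DisposableAgent` interface and the `teardown` helper are invented for this example, and both the `"running"` status value and the promise returned by stop() are assumptions rather than confirmed API details.

```typescript
// Stand-in for the parts of PageAgent used during cleanup.
interface DisposableAgent {
  status: string; // "running" is an assumed status value
  stop(): Promise<void>; // assumed to return a promise
  dispose(): void;
}

// Call this from outside the agent, for example in a component's unmount handler.
async function teardown(agent: DisposableAgent): Promise<void> {
  if (agent.status === "running") {
    // Let the current task wind down before releasing the instance.
    await agent.stop();
  }
  agent.dispose();
}
```

Awaiting stop() is appropriate here only because teardown runs outside the agent's own lifecycle hooks.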
Never await agent.stop() inside a lifecycle hook (onBeforeStep, onAfterStep, etc.), as that would cause a deadlock. Call stop() from outside the agent’s own execution context.
Production: Securing Your API Key
Passing apiKey directly in frontend code means the key is visible in your JavaScript bundle and network requests. For any production deployment, proxy the LLM call through your own backend and use customFetch to intercept requests, so that the real key is attached server-side and never reaches the browser.
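One way to build such an interceptor is sketched below. It assumes customFetch accepts a fetch-like function, which this guide does not spell out; the `/api/llm-proxy` route and the helper names `toProxyUrl`, `stripAuth`, and `makeProxyFetch` are hypothetical. Your backend would be responsible for adding the real key and forwarding the request to the provider.

```typescript
// Hypothetical backend route; replace with your own proxy endpoint.
const PROXY_BASE = "/api/llm-proxy";

// Rewrites an upstream LLM URL to the same path on the proxy.
function toProxyUrl(upstreamUrl: string, baseURL: string, proxyBase: string = PROXY_BASE): string {
  if (!upstreamUrl.startsWith(baseURL)) {
    return upstreamUrl; // leave unrelated requests untouched
  }
  return proxyBase + upstreamUrl.slice(baseURL.length);
}

// Drops the Authorization header so no provider key leaves the browser.
function stripAuth(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== "authorization") {
      out[name] = value;
    }
  }
  return out;
}

interface SimpleInit {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}

// Returns a fetch-like function; the simplified signature is an assumption.
function makeProxyFetch(baseURL: string, proxyBase: string = PROXY_BASE) {
  return (url: string, init: SimpleInit = {}) =>
    fetch(toProxyUrl(url, baseURL, proxyBase), {
      ...init,
      headers: stripAuth(init.headers ?? {}),
    });
}
```

With this in place, the apiKey passed to the agent in the browser can be a dummy value, since the header carrying it is removed before the request is sent.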
Next Steps
Supported Models
Browse tested LLMs, including the free evaluation API and local model options.
Troubleshooting
Diagnose and fix the most common setup and runtime issues.