Get Started with Page Agent in Minutes

Page Agent can be added to any webpage with a single script tag or a one-line npm install. This guide walks you through both integration paths, explains every configuration field, and shows you how to listen to agent events and safely handle API credentials in production.

Choose an integration path

Pick the approach that matches your project setup:

CDN (Quick Try)
npm / yarn / pnpm

The CDN bundle includes a pre-configured demo LLM, so you can evaluate Page Agent without an API key. Just drop one <script> tag into your HTML:

index.html

<script src="https://cdn.jsdelivr.net/npm/page-agent@1.11.0/dist/iife/page-agent.demo.js" crossorigin="true"></script>

Two mirrors are available:

Mirror	URL
Global (jsDelivr)	`https://cdn.jsdelivr.net/npm/page-agent@1.11.0/dist/iife/page-agent.demo.js`
China (npmmirror)	`https://registry.npmmirror.com/page-agent/1.11.0/files/dist/iife/page-agent.demo.js`

By default the script auto-initialises a demo agent when it loads. Add ?autoInit=false to the URL to suppress auto-init; you can then create your own instance with new window.PageAgent(...).
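As a small illustration of the autoInit switch, the sketch below builds the demo-bundle URL and the matching script tag. The helper names (demoBundleUrl, demoScriptTag) are hypothetical and not part of Page Agent; only the jsDelivr URL and the ?autoInit=false parameter come from this guide.

```typescript
// The jsDelivr demo bundle URL listed in the mirror table above.
const DEMO_BUNDLE =
  'https://cdn.jsdelivr.net/npm/page-agent@1.11.0/dist/iife/page-agent.demo.js'

// Hypothetical helper: auto-init is the default, so the query parameter
// is only appended when auto-init should be suppressed.
function demoBundleUrl(autoInit: boolean): string {
  return autoInit ? DEMO_BUNDLE : `${DEMO_BUNDLE}?autoInit=false`
}

// Hypothetical helper: renders the <script> tag for server-side templates.
function demoScriptTag(autoInit: boolean): string {
  return `<script src="${demoBundleUrl(autoInit)}" crossorigin="true"></script>`
}
```

With auto-init suppressed, the page is responsible for creating its own instance with new window.PageAgent(...) once the script has loaded.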

The demo CDN bundle uses a free testing LLM API provided by the Page Agent team. It is for technical evaluation only and subject to usage limits. By using it, you agree to the Terms of Use. Do not use it in production.

For a bundled project (npm / yarn / pnpm), install the package from the npm registry:

npm install page-agent

Then import the class in your application code:

import { PageAgent } from 'page-agent'

Create an agent instance

Construct a PageAgent with your LLM credentials and preferred language. The agent attaches itself to the current page automatically:

agent.ts

import { PageAgent } from 'page-agent'

const agent = new PageAgent({
  model: 'qwen3.5-plus',
  baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
  apiKey: 'YOUR_API_KEY',
  language: 'en-US',
})

Configuration fields

model (string, required)

The model name as expected by your LLM provider, e.g. "qwen3.5-plus", "gpt-4o", "llama3.2". The model must support structured tool/function calling.

baseURL (string, required)

Base URL of the OpenAI-compatible API endpoint. Examples:

Alibaba DashScope: https://dashscope.aliyuncs.com/compatible-mode/v1
OpenAI: https://api.openai.com/v1
Local Ollama: http://localhost:11434/v1

apiKey (string)

API key for your LLM provider. Required for cloud providers; can be any non-empty string for local models (Ollama, LM Studio) that don’t check the key.

language ("en-US" | "zh-CN", default: "en-US")

UI and system-prompt language. Controls the language used by the built-in panel and by agent responses.

maxSteps (number, default: 40)

Maximum number of ReAct loop iterations per task. Increase for very long multi-step workflows; lower to cap LLM spend.

stepDelay (number, default: 0.4)

Seconds to wait between steps. Increase this if pages need extra time to settle after an action.

instructions.system (string)

Global system-level instructions injected into every LLM prompt. Use this to describe your application, restrict agent scope, or define domain terminology.
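To make the defaults concrete, here is a minimal sketch that fills in the documented default values before a configuration is handed to the agent. The type AgentConfigSketch and the function resolveConfig are hypothetical names introduced for illustration; the package's own exported types should be preferred where available.

```typescript
// Hypothetical type mirroring the fields documented above.
type Language = 'en-US' | 'zh-CN'

interface AgentConfigSketch {
  model: string
  baseURL: string
  apiKey?: string
  language?: Language
  maxSteps?: number
  stepDelay?: number
  instructions?: { system?: string }
}

// Defaults as stated in the field reference.
const DOCUMENTED_DEFAULTS = {
  language: 'en-US' as Language,
  maxSteps: 40,
  stepDelay: 0.4,
}

// Validates the two required fields and merges in the documented defaults.
function resolveConfig(config: AgentConfigSketch) {
  if (!config.model) throw new Error('model is required')
  if (!config.baseURL) throw new Error('baseURL is required')
  return { ...DOCUMENTED_DEFAULTS, ...config }
}

// Example: a local Ollama model with a tighter step budget.
const localConfig = resolveConfig({
  model: 'llama3.2',
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // any non-empty string works when the server ignores the key
  maxSteps: 15,
  instructions: { system: 'Only operate on the checkout form.' },
})
```

Here localConfig keeps the explicit maxSteps of 15 and picks up the documented defaults for language and stepDelay.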

Execute a task

Call agent.execute() with a natural-language task string. It returns a promise that resolves with an ExecutionResult when the task finishes — whether the agent calls done, hits maxSteps, or encounters an unrecoverable error. The promise only rejects for pre-flight failures (e.g. calling execute() while already running):

agent.ts

const result = await agent.execute('Click the login button')

console.log(result.success) // true | false
console.log(result.data)    // agent's final summary text
console.log(result.history) // full history of every step taken

You can also show the built-in floating panel and let the user type instructions directly:

agent.panel.show()

execute() fails immediately if the agent has already been disposed or if a task is already running (status is 'running'). In concurrent code, check agent.status before calling execute(), or await the call inside a try/catch.
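One way to guard against both failure modes is a thin wrapper around execute(). The sketch below is an assumption-laden illustration: AgentLike lists only the members named in this guide, and runTask and RunOutcome are hypothetical names, not part of the library.

```typescript
// Minimal shape of the agent assumed by this sketch.
interface AgentLike {
  status: string
  execute(task: string): Promise<{ success: boolean; data: string }>
}

type RunOutcome =
  | { kind: 'skipped'; reason: string }
  | { kind: 'failed'; error: unknown }
  | { kind: 'finished'; success: boolean; data: string }

// Hypothetical wrapper: never throws, always reports what happened.
async function runTask(agent: AgentLike, task: string): Promise<RunOutcome> {
  // Avoid starting a second task while one is in flight.
  if (agent.status === 'running') {
    return { kind: 'skipped', reason: 'a task is already running' }
  }
  try {
    const result = await agent.execute(task)
    return { kind: 'finished', success: result.success, data: result.data }
  } catch (error) {
    // Pre-flight failure, for example a disposed agent.
    return { kind: 'failed', error }
  }
}
```

Because the call is awaited inside try/catch, the wrapper handles the failure the same way whether execute() throws or returns a rejected promise.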

Listen to agent events

PageAgent extends EventTarget. Subscribe to events for real-time UI feedback and debugging:

agent-events.ts

// Status transitions: 'idle' → 'running' → 'completed' | 'error' | 'stopped'
agent.addEventListener('statuschange', () => {
  console.log('Agent status:', agent.status)
})

// Transient activity events — ideal for driving a live status indicator
agent.addEventListener('activity', (e: Event) => {
  const activity = (e as CustomEvent).detail
  // activity.type: 'thinking' | 'executing' | 'executed' | 'retrying' | 'error'
  if (activity.type === 'executing') {
    console.log(`Executing tool: ${activity.tool}`, activity.input)
  }
})

// History events — persistent, forms the agent's memory
agent.addEventListener('historychange', () => {
  console.log('History updated:', agent.history)
})

Event types at a glance:

Event	When it fires	Payload
`statuschange`	Agent status transitions	`agent.status`
`activity`	Real-time step activity (transient)	`AgentActivity` on `e.detail`
`historychange`	History array mutated	`agent.history`
`dispose`	Agent is cleaned up	—
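For a live status indicator, the activity payload can be mapped to a short label. The sketch below assumes the payload shape implied by the listener example (type, tool, input); ActivitySketch and activityLabel are hypothetical names, and whether tool is present on every activity type is an assumption, so a fallback is used.

```typescript
// Assumed shape of the `activity` event payload (e.detail).
interface ActivitySketch {
  type: 'thinking' | 'executing' | 'executed' | 'retrying' | 'error'
  tool?: string
  input?: unknown
}

// Hypothetical helper: turns a transient activity into a short status label.
function activityLabel(activity: ActivitySketch): string {
  const tool = activity.tool ?? 'tool' // fall back when no tool name is given
  switch (activity.type) {
    case 'thinking':
      return 'Thinking...'
    case 'executing':
      return `Running ${tool}...`
    case 'executed':
      return `Finished ${tool}`
    case 'retrying':
      return 'Retrying...'
    case 'error':
      return 'Something went wrong'
    default:
      return 'Working...' // unknown activity types still get a label
  }
}
```

Inside an activity listener, the label could be written to a status element of your own, for example by assigning activityLabel((e as CustomEvent).detail) to its textContent.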

Stop or dispose the agent

Use stop() to cancel a running task gracefully. Use dispose() when you are done with the agent instance entirely (e.g. on component unmount):

agent-lifecycle.ts

// Cancel the current task and wait for it to fully settle
await agent.stop()
console.log('Agent stopped. Status:', agent.status) // 'stopped'

// Tear down the agent and its DOM overlay
agent.dispose()
// After dispose(), calling execute() will throw.

Never await agent.stop() inside a lifecycle hook (onBeforeStep, onAfterStep, etc.) — that would cause a deadlock. Call stop() from outside the agent’s own execution context.
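A teardown routine for a component unmount handler might combine the two calls as sketched below. StoppableAgent lists only the members named in this guide, and teardownAgent is a hypothetical helper, not a library function.

```typescript
// Minimal shape of the agent assumed by this sketch.
interface StoppableAgent {
  status: string
  stop(): Promise<void>
  dispose(): void
}

// Hypothetical teardown helper. It is meant to be called from application
// code (not from a lifecycle hook), so awaiting stop() here is safe.
async function teardownAgent(agent: StoppableAgent): Promise<void> {
  if (agent.status === 'running') {
    await agent.stop() // let the current task settle first
  }
  agent.dispose() // then remove the agent and its DOM overlay
}
```

Stopping before disposing gives the running task a chance to settle, so no step is still in flight when the overlay is removed.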

Production: Securing Your API Key

Passing apiKey directly in frontend code means the key is visible in your JavaScript bundle and network requests. Never expose a real LLM API key in client-side code. Anyone who opens DevTools can read it, copy it, and run up charges on your account.

For any production deployment, proxy the LLM call through your own backend and use customFetch to intercept requests:

secure-agent.ts

import { PageAgent } from 'page-agent'

const agent = new PageAgent({
  model: 'gpt-4o',
  baseURL: '/api/llm-proxy', // your backend endpoint
  apiKey: 'not-used',        // placeholder — your proxy validates the session
  language: 'en-US',

  // Intercept every LLM fetch and attach a session token instead
  customFetch: async (url, init) => {
    const headers = new Headers(init?.headers)
    headers.set('X-Session-Token', getSessionToken()) // your auth mechanism
    headers.delete('Authorization')                   // remove the placeholder key
    return fetch(url, { ...init, headers })
  },
})

Your backend proxy receives the session token, validates the user, attaches the real API key, and forwards the request to the LLM provider. This pattern keeps the credential server-side at all times.
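The server-side half of this pattern can be sketched as a pure function that decides whether to forward a request and builds the upstream headers. Everything here is illustrative: prepareUpstream and ProxyDecision are hypothetical names, session validation is passed in because it depends on your own auth system, and the Bearer scheme assumes an OpenAI-compatible provider.

```typescript
// Result of the proxy's pre-forwarding check.
interface ProxyDecision {
  status: 200 | 401
  upstreamHeaders?: Record<string, string>
}

// Hypothetical helper for a backend route such as /api/llm-proxy.
function prepareUpstream(
  incoming: Record<string, string | undefined>,
  isValidSession: (token: string) => boolean,
  realApiKey: string,
): ProxyDecision {
  // Node's HTTP server exposes header names in lower case.
  const token = incoming['x-session-token']
  if (!token || !isValidSession(token)) return { status: 401 }
  return {
    status: 200,
    upstreamHeaders: {
      'Content-Type': 'application/json',
      // The real key is attached only here, on the server.
      Authorization: `Bearer ${realApiKey}`,
    },
  }
}
```

The route handler would return 401 when the decision says so, and otherwise forward the request body to the provider with upstreamHeaders, reading realApiKey from server-side configuration such as an environment variable.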

Next Steps

Supported Models

Troubleshooting
