HandstagesAgentToolHandlers is the interface you implement to connect the tool definitions in handstagesAgentTools to an actual running browser. Each method in the interface receives a typed input object (inferred from the tool’s Zod schema) and returns a typed output object. You pass your implementation’s methods as the execute function on each tool when setting up the Vercel AI SDK.
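The pairing of tool definitions with handler methods can be sketched without the AI SDK. The snippet below is only an illustration: it assumes each tool definition is a plain object and that handler method names match tool names. `ToolDef`, `Handler`, `attachExecute`, and the sample data are hypothetical stand-ins, not exports of @handstage/agent.

```typescript
// Hypothetical stand-in for a schema-only tool definition.
type ToolDef = { description: string }

// A handler takes the tool input and resolves to the tool output.
type Handler = (input: any) => Promise<unknown>

// Attach handlers[name] as `execute` on every tool definition.
// Throws early if a tool has no matching handler method.
function attachExecute<T extends Record<string, ToolDef>>(
  tools: T,
  handlers: Record<string, Handler>,
): Record<keyof T, ToolDef & { execute: Handler }> {
  const out = {} as Record<keyof T, ToolDef & { execute: Handler }>
  for (const name of Object.keys(tools) as (keyof T & string)[]) {
    const handler = handlers[name]
    if (!handler) throw new Error(`No handler implemented for tool "${name}"`)
    // call() keeps `this` bound when handlers is a class instance.
    out[name] = { ...tools[name], execute: (input) => handler.call(handlers, input) }
  }
  return out
}

// Sample data (hypothetical): two schema-only tools and matching handlers.
const sampleTools = {
  pages: { description: "List open tabs" },
  newPage: { description: "Open a tab" },
}
const sampleHandlers: Record<string, Handler> = {
  pages: async () => ({ pages: [] }),
  newPage: async (_input: { url?: string }) => ({ pageId: "tab-1" }),
}

const wired = attachExecute(sampleTools, sampleHandlers)
```

Failing fast on a missing handler surfaces an incomplete implementation at startup instead of at the first tool call.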

Supporting types

HandstagesAgentContext

HandstagesAgentContext represents the browser context that your handler implementation typically holds onto. It corresponds to the context property of the object returned by V3.connectLocal().
import type { HandstagesAgentContext } from "@handstage/agent"

interface HandstagesAgentContext {
  pages(): Page[]
  activePage(): Page | undefined
  setActivePage(page: Page): void
  newPage(url?: string): Promise<Page>
}
| Method | Description |
| --- | --- |
| pages() | Returns all currently open Page instances in the context. |
| activePage() | Returns the foreground Page, or undefined if none is active. |
| setActivePage(page) | Brings the given Page to the foreground. |
| newPage(url?) | Opens a new Page, optionally navigating to url immediately. |
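Tool inputs carry pageId strings while the context works with Page objects, so most handlers start by translating one into the other. The sketch below shows that lookup against an in-memory fake. `PageLike` and `ContextLike` are reduced stand-ins written for this example, and the `targetId` property is taken from the example implementation on this page rather than from a verified Page API.

```typescript
// Reduced stand-ins for Page and HandstagesAgentContext (assumptions for this sketch).
interface PageLike {
  targetId: string
  url(): string
}
interface ContextLike {
  pages(): PageLike[]
  activePage(): PageLike | undefined
}

// Resolve a tool-supplied pageId to a page, or undefined if no tab matches.
function findPage(ctx: ContextLike, pageId: string): PageLike | undefined {
  return ctx.pages().find((p) => p.targetId === pageId)
}

// Describe each tab the way the pages tool reports it (title omitted here).
// activePage() is read once instead of once per tab.
function listTabs(ctx: ContextLike) {
  const active = ctx.activePage()
  return ctx.pages().map((p) => ({
    pageId: p.targetId,
    url: p.url(),
    activated: p === active,
  }))
}

// In-memory fake context with two tabs; the second is in the foreground.
const tabA: PageLike = { targetId: "A", url: () => "https://example.com/" }
const tabB: PageLike = { targetId: "B", url: () => "about:blank" }
const fakeCtx: ContextLike = {
  pages: () => [tabA, tabB],
  activePage: () => tabB,
}
```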

HandstagesAgent namespace

The HandstagesAgent namespace re-exports typed input and output types for every tool, inferred directly from the Zod schemas in handstagesAgentTools. Use these types to annotate your handler methods without manually writing interface types.
import type { HandstagesAgent } from "@handstage/agent"
| Type | Description |
| --- | --- |
| HandstagesAgent.ToolName | Union of all 17 tool name strings. |
| HandstagesAgent.PagesInput | Input for pages: {} |
| HandstagesAgent.PagesOutput | Output for pages: { pages: PageEntry[] } |
| HandstagesAgent.PageEntry | Single entry in PagesOutput.pages |
| HandstagesAgent.NewPageInput | Input for newPage: { url?: string } |
| HandstagesAgent.NewPageOutput | Output for newPage: { pageId: string } |
| HandstagesAgent.SetActivePageInput | Input for setActivePage: { pageId: string } |
| HandstagesAgent.SetActivePageOutput | Output for setActivePage: { ok: true } \| { ok: false; error: string } |
| HandstagesAgent.GotoInput | Input for goto |
| HandstagesAgent.GotoOutput | Output for goto |
| HandstagesAgent.ReloadInput | Input for reload |
| HandstagesAgent.ReloadOutput | Output for reload |
| HandstagesAgent.GoBackInput | Input for goBack |
| HandstagesAgent.GoBackOutput | Output for goBack |
| HandstagesAgent.GoForwardInput | Input for goForward |
| HandstagesAgent.GoForwardOutput | Output for goForward |
| HandstagesAgent.SnapshotInput | Input for snapshot |
| HandstagesAgent.SnapshotOutput | Output for snapshot |
| HandstagesAgent.PageInfoInput | Input for pageInfo |
| HandstagesAgent.PageInfoOutput | Output for pageInfo |
| HandstagesAgent.ClickInput | Input for click |
| HandstagesAgent.ClickOutput | Output for click |
| HandstagesAgent.HoverInput | Input for hover |
| HandstagesAgent.HoverOutput | Output for hover |
| HandstagesAgent.ScrollInput | Input for scroll |
| HandstagesAgent.ScrollOutput | Output for scroll |
| HandstagesAgent.TypeInput | Input for type |
| HandstagesAgent.TypeOutput | Output for type |
| HandstagesAgent.ClickOnInput | Input for click_on |
| HandstagesAgent.ClickOnOutput | Output for click_on |
| HandstagesAgent.FillOnInput | Input for fill_on |
| HandstagesAgent.FillOnOutput | Output for fill_on |
| HandstagesAgent.TypeOnInput | Input for type_on |
| HandstagesAgent.TypeOnOutput | Output for type_on |
| HandstagesAgent.HoverOnInput | Input for hover_on |
| HandstagesAgent.HoverOnOutput | Output for hover_on |
| HandstagesAgent.OkResult | { ok: true } |
| HandstagesAgent.ErrResult | { ok: false; error: string } |
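Because most outputs pair OkResult-style and ErrResult-style shapes, callers can narrow on the ok field. The sketch below declares local mirrors of those shapes so it runs on its own. The exact form of `GotoOutput` shown here (a discriminated union with url on the success branch) is an assumption inferred from the field descriptions, and `describeGoto` and `toErr` are hypothetical helper names.

```typescript
// Local mirrors of the documented result shapes (declared here so the sketch is standalone).
type OkResult = { ok: true }
type ErrResult = { ok: false; error: string }

// Assumed shape: url is only present on the success branch.
type GotoOutput = (OkResult & { url: string }) | ErrResult

// Narrowing on `ok` gives access to the fields of the matching branch.
function describeGoto(result: GotoOutput): string {
  if (result.ok) {
    return `navigated to ${result.url}`
  }
  return `navigation failed: ${result.error}`
}

// Convert any thrown value into an ErrResult with a readable message.
function toErr(err: unknown): ErrResult {
  return { ok: false, error: err instanceof Error ? err.message : String(err) }
}
```

Using err.message rather than String(err) avoids the "Error: " prefix in the text the model sees; either choice satisfies the documented output shape.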

HandstagesAgentToolHandlers interface

Each method in HandstagesAgentToolHandlers corresponds to one of the 17 tools in handstagesAgentTools. Your implementation is responsible for translating tool inputs into Handstage browser operations and returning the correct output shape.
import type { HandstagesAgentToolHandlers } from "@handstage/agent"

pages

Returns all open tabs in the browser context.

Signature
pages(input: HandstagesAgent.PagesInput): Promise<HandstagesAgent.PagesOutput>

Input
input ({}): No fields required.

Output
pages (object[], required): Array of tab entries, each with pageId, url, title, and activated.

newPage

Opens a new browser tab, optionally navigating to a URL.

Signature
newPage(input: HandstagesAgent.NewPageInput): Promise<HandstagesAgent.NewPageOutput>

Input
url (string): Optional starting URL. Defaults to "about:blank".

Output
pageId (string, required): Unique identifier for the new tab.

setActivePage

Brings a tab to the foreground.

Signature
setActivePage(input: HandstagesAgent.SetActivePageInput): Promise<HandstagesAgent.SetActivePageOutput>

Input
pageId (string, required): ID of the tab to focus.

Output
ok (true | false, required): true on success; false on failure.
error (string): Error message when ok is false.

goto

Navigates a tab to a URL.

Signature
goto(input: HandstagesAgent.GotoInput): Promise<HandstagesAgent.GotoOutput>

Input
pageId (string, required): Target tab ID.
url (string, required): Destination URL.
waitUntil ("load" | "domcontentloaded" | "networkidle"): Lifecycle event to await.
timeoutMs (number): Navigation timeout in milliseconds.

Output
ok (true | false, required): true on success; false on failure.
url (string): Final URL after navigation. Only present when ok is true.
error (string): Error message when ok is false.

reload

Reloads the current document in a tab.

Signature
reload(input: HandstagesAgent.ReloadInput): Promise<HandstagesAgent.ReloadOutput>

Input
pageId (string, required): Target tab ID.
waitUntil ("load" | "domcontentloaded" | "networkidle"): Lifecycle event to await.
timeoutMs (number): Reload timeout in milliseconds.
ignoreCache (boolean): Pass true to bypass the browser cache.

Output
ok (true | false, required): true on success; false on failure.
url (string): URL after reload. Only present when ok is true.
error (string): Error message when ok is false.

goBack

Goes back one step in a tab's session history.

Signature
goBack(input: HandstagesAgent.GoBackInput): Promise<HandstagesAgent.GoBackOutput>

Input
pageId (string, required): Target tab ID.
waitUntil ("load" | "domcontentloaded" | "networkidle"): Lifecycle event to await after navigation.
timeoutMs (number): Timeout in milliseconds.

Output
ok (true | false, required): true on success; false on failure.
navigated (boolean): Whether the tab actually navigated back. Only present when ok is true.
url (string): Current URL after the operation. Only present when ok is true.
error (string): Error message when ok is false.

goForward

Goes forward one step in a tab's session history.

Signature
goForward(input: HandstagesAgent.GoForwardInput): Promise<HandstagesAgent.GoForwardOutput>

Input
pageId (string, required): Target tab ID.
waitUntil ("load" | "domcontentloaded" | "networkidle"): Lifecycle event to await after navigation.
timeoutMs (number): Timeout in milliseconds.

Output
ok (true | false, required): true on success; false on failure.
navigated (boolean): Whether the tab actually navigated forward. Only present when ok is true.
url (string): Current URL after the operation. Only present when ok is true.
error (string): Error message when ok is false.

snapshot

Captures the accessibility tree for a tab.

Signature
snapshot(input: HandstagesAgent.SnapshotInput): Promise<HandstagesAgent.SnapshotOutput>

Input
pageId (string, required): Target tab ID.
includeIframes (boolean): Whether to include nodes from embedded iframes.

Output
ok (true | false, required): true on success; false on failure.
tree (string): Accessibility tree as multiline text. Only present when ok is true.
xpathMap (Record<string, string>): Maps encoded node IDs to XPath selectors. Only present when ok is true.
urlMap (Record<string, string>): Maps encoded node IDs to link href values. Only present when ok is true.
error (string): Error message when ok is false.
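Since xpathMap ties the node IDs shown in the tree to XPath selectors, a snapshot result can feed a selector-based tool such as click_on. The sketch below shows that hand-off with local type mirrors. The node ID format ("0-12"), the tree line, and the sample XPath are invented for illustration, `clickInputForNode` is a hypothetical helper, and the sketch assumes the XPath values can be passed unchanged as the select field.

```typescript
// Local mirrors of the documented shapes (success branch of snapshot, input of click_on).
type SnapshotOk = {
  ok: true
  tree: string
  xpathMap: Record<string, string>
  urlMap: Record<string, string>
}
type ClickOnInput = { pageId: string; select: string }

// Build a click_on input for a node ID taken from the snapshot tree.
// Returns undefined when the ID has no XPath entry.
function clickInputForNode(
  pageId: string,
  snap: SnapshotOk,
  nodeId: string,
): ClickOnInput | undefined {
  const xpath = snap.xpathMap[nodeId]
  return xpath === undefined ? undefined : { pageId, select: xpath }
}

// Sample snapshot (hypothetical ID format and XPath).
const sampleSnapshot: SnapshotOk = {
  ok: true,
  tree: "[0-12] button: Submit",
  xpathMap: { "0-12": "/html[1]/body[1]/button[1]" },
  urlMap: {},
}

const clickInput = clickInputForNode("A", sampleSnapshot, "0-12")
```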

pageInfo

Returns the current URL and document title for a tab.

Signature
pageInfo(input: HandstagesAgent.PageInfoInput): Promise<HandstagesAgent.PageInfoOutput>

Input
pageId (string, required): Target tab ID.

Output
ok (true | false, required): true on success; false on failure.
url (string): Current URL. Only present when ok is true.
title (string): Document title. Only present when ok is true.
error (string): Error message when ok is false.

click

Dispatches a mouse click at viewport coordinates.

Signature
click(input: HandstagesAgent.ClickInput): Promise<HandstagesAgent.ClickOutput>

Input
pageId (string, required): Target tab ID.
x (number, required): Horizontal coordinate in CSS pixels.
y (number, required): Vertical coordinate in CSS pixels.
button ("left" | "right" | "middle"): Mouse button. Defaults to "left".
clickCount (number): Number of clicks. Positive integer.

Output
ok (true | false, required): true on success; false on failure.
xpathAtPoint (string): XPath of the element at the clicked point, if available. Only present when ok is true.
error (string): Error message when ok is false.
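A click handler usually needs concrete values for the optional fields before it talks to the browser. The sketch below fills in defaults and re-checks the documented constraints. The "left" default comes from the field description above; the clickCount default of 1 is an assumption, and `resolveClick` and `ResolvedClick` are hypothetical names.

```typescript
// Local mirrors of the documented click input and error shapes.
type MouseButton = "left" | "right" | "middle"
type ClickInput = {
  pageId: string
  x: number
  y: number
  button?: MouseButton
  clickCount?: number
}
type ResolvedClick = {
  pageId: string
  x: number
  y: number
  button: MouseButton
  clickCount: number
}
type ErrResult = { ok: false; error: string }

// Apply defaults and validate; returns an ErrResult instead of throwing.
function resolveClick(input: ClickInput): ResolvedClick | ErrResult {
  const clickCount = input.clickCount ?? 1 // assumed default, not stated in the docs
  if (!Number.isInteger(clickCount) || clickCount < 1) {
    return { ok: false, error: `clickCount must be a positive integer, got ${clickCount}` }
  }
  if (!Number.isFinite(input.x) || !Number.isFinite(input.y)) {
    return { ok: false, error: "x and y must be finite numbers" }
  }
  return {
    pageId: input.pageId,
    x: input.x,
    y: input.y,
    button: input.button ?? "left", // documented default
    clickCount,
  }
}
```

If the tool's Zod schema already enforces these constraints, the validation is redundant and only the defaulting matters.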

hover

Moves the pointer to viewport coordinates.

Signature
hover(input: HandstagesAgent.HoverInput): Promise<HandstagesAgent.HoverOutput>

Input
pageId (string, required): Target tab ID.
x (number, required): Horizontal coordinate in CSS pixels.
y (number, required): Vertical coordinate in CSS pixels.

Output
ok (true | false, required): true on success; false on failure.
xpathAtPoint (string): XPath of the element at the pointer position. Only present when ok is true.
error (string): Error message when ok is false.

scroll

Dispatches a mouse wheel event at viewport coordinates.

Signature
scroll(input: HandstagesAgent.ScrollInput): Promise<HandstagesAgent.ScrollOutput>

Input
pageId (string, required): Target tab ID.
x (number, required): Horizontal coordinate of the wheel event in CSS pixels.
y (number, required): Vertical coordinate of the wheel event in CSS pixels.
deltaX (number, required): Horizontal scroll delta in pixels.
deltaY (number, required): Vertical scroll delta in pixels.

Output
ok (true | false, required): true on success; false on failure.
xpathAtPoint (string): XPath of the element at the scroll position. Only present when ok is true.
error (string): Error message when ok is false.

type

Types text at the currently focused element using key events.

Signature
type(input: HandstagesAgent.TypeInput): Promise<HandstagesAgent.TypeOutput>

Input
pageId (string, required): Target tab ID.
text (string, required): Text to type.
delay (number): Milliseconds between keystrokes. Non-negative.
withMistakes (boolean): Simulate human-like typing with occasional errors and corrections.

Output
ok (true | false, required): true on success; false on failure.
error (string): Error message when ok is false.

click_on

Clicks the first element matching a CSS or XPath selector.

Signature
click_on(input: HandstagesAgent.ClickOnInput): Promise<HandstagesAgent.ClickOnOutput>

Input
pageId (string, required): Target tab ID.
select (string, required): CSS selector or XPath expression (e.g., //button[@id='submit']).

Output
ok (true | false, required): true on success; false on failure.
error (string): Error message when ok is false.
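The select field accepts either a CSS selector or an XPath expression. If the browser API behind your handler needs to be told which one it received, a prefix check is one possible heuristic. The sketch below is an assumption about how to tell them apart, not documented Handstage behavior, and `classifySelector` is a hypothetical helper.

```typescript
type SelectorKind = "xpath" | "css"

// Heuristic: XPath expressions start with "/", "./", or "(" while
// CSS selectors never do. Everything else is treated as CSS.
function classifySelector(select: string): SelectorKind {
  const s = select.trim()
  if (s.startsWith("/") || s.startsWith("./") || s.startsWith("(")) {
    return "xpath"
  }
  return "css"
}

const kinds = [
  classifySelector("//button[@id='submit']"), // absolute XPath
  classifySelector("(//a)[1]"), // parenthesized XPath
  classifySelector("#submit"), // CSS id selector
  classifySelector(".btn.primary"), // CSS class selector
]
```

The same check applies to the select field of fill_on, type_on, and hover_on.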

fill_on

Clears and fills an input element matched by a CSS or XPath selector.

Signature
fill_on(input: HandstagesAgent.FillOnInput): Promise<HandstagesAgent.FillOnOutput>

Input
pageId (string, required): Target tab ID.
select (string, required): CSS selector or XPath expression targeting the input element.
value (string, required): New value to set.

Output
ok (true | false, required): true on success; false on failure.
error (string): Error message when ok is false.

type_on

Focuses an element by selector, then types text into it using key events.

Signature
type_on(input: HandstagesAgent.TypeOnInput): Promise<HandstagesAgent.TypeOnOutput>

Input
pageId (string, required): Target tab ID.
select (string, required): CSS selector or XPath expression targeting the element.
text (string, required): Text to type.
delay (number): Milliseconds between keystrokes. Non-negative.

Output
ok (true | false, required): true on success; false on failure.
error (string): Error message when ok is false.

hover_on

Moves the pointer to the first element matching a CSS or XPath selector.

Signature
hover_on(input: HandstagesAgent.HoverOnInput): Promise<HandstagesAgent.HoverOnOutput>

Input
pageId (string, required): Target tab ID.
select (string, required): CSS selector or XPath expression targeting the element.

Output
ok (true | false, required): true on success; false on failure.
error (string): Error message when ok is false.

Example implementation

Below is a minimal, partial implementation of HandstagesAgentToolHandlers. It holds a reference to a HandstagesAgentContext and delegates each method to the underlying Handstage browser APIs. Four of the 17 methods are shown; the remaining 13 follow the same pattern.
import type {
  HandstagesAgent,
  HandstagesAgentContext,
  HandstagesAgentToolHandlers,
} from "@handstage/agent"
import type { Page } from "@handstage/core"

class MyBrowserHandlers implements HandstagesAgentToolHandlers {
  constructor(private ctx: HandstagesAgentContext) {}

  async pages(_input: HandstagesAgent.PagesInput): Promise<HandstagesAgent.PagesOutput> {
    return {
      pages: this.ctx.pages().map((p) => ({
        pageId: p.targetId,
        url: p.url(),
        title: p.title(),
        activated: p === this.ctx.activePage(),
      })),
    }
  }

  async newPage(input: HandstagesAgent.NewPageInput): Promise<HandstagesAgent.NewPageOutput> {
    const page = await this.ctx.newPage(input.url)
    return { pageId: page.targetId }
  }

  async setActivePage(
    input: HandstagesAgent.SetActivePageInput,
  ): Promise<HandstagesAgent.SetActivePageOutput> {
    const page = this.ctx.pages().find((p) => p.targetId === input.pageId)
    if (!page) return { ok: false, error: `No page with id ${input.pageId}` }
    this.ctx.setActivePage(page)
    return { ok: true }
  }

  async goto(input: HandstagesAgent.GotoInput): Promise<HandstagesAgent.GotoOutput> {
    const page = this.getPage(input.pageId)
    if (!page) return { ok: false, error: `No page with id ${input.pageId}` }
    try {
      await page.goto(input.url, {
        waitUntil: input.waitUntil,
        timeout: input.timeoutMs,
      })
      return { ok: true, url: page.url() }
    } catch (err) {
      return { ok: false, error: String(err) }
    }
  }

  // ... implement remaining 13 methods following the same pattern

  private getPage(pageId: string): Page | undefined {
    return this.ctx.pages().find((p) => p.targetId === pageId)
  }
}
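Most page-scoped handlers repeat the same two steps: look up the page, then wrap the browser call in try/catch. One way to factor that out is a small wrapper, sketched below against a fake page. `withPage`, `gotoHandler`, `makeFakePage`, `PageLike`, and `ContextLike` are hypothetical names for this example, and the fake page's goto takes only a URL, which is simpler than the real call shown above.

```typescript
// Reduced stand-ins for Page and HandstagesAgentContext (assumptions for this sketch).
interface PageLike {
  targetId: string
  url(): string
  goto(url: string): Promise<void>
}
interface ContextLike {
  pages(): PageLike[]
}
type ErrResult = { ok: false; error: string }

// Look up the page, run the action, and turn a missing page or a thrown
// error into an ErrResult so handlers never reject.
async function withPage<T>(
  ctx: ContextLike,
  pageId: string,
  action: (page: PageLike) => Promise<T>,
): Promise<T | ErrResult> {
  const page = ctx.pages().find((p) => p.targetId === pageId)
  if (!page) return { ok: false, error: `No page with id ${pageId}` }
  try {
    return await action(page)
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) }
  }
}

// A goto handler reduced to its browser-specific part.
function gotoHandler(ctx: ContextLike, input: { pageId: string; url: string }) {
  return withPage(ctx, input.pageId, async (page) => {
    await page.goto(input.url)
    return { ok: true as const, url: page.url() }
  })
}

// Fake page for trying the helper without a browser; rejects non-http URLs.
function makeFakePage(targetId: string): PageLike {
  let current = "about:blank"
  return {
    targetId,
    url: () => current,
    goto: async (url) => {
      if (!url.startsWith("http")) throw new Error(`Unsupported URL: ${url}`)
      current = url
    },
  }
}
```

With this wrapper each remaining handler only contains the call that differs, and every failure path returns the same error shape.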

Complete browser agent example

The following example shows how to wire everything together: create a Handstage browser, implement the handlers, attach execution to each tool, and run a browser automation task with the Vercel AI SDK.
import { generateText, tool } from "ai"
import { openai } from "@ai-sdk/openai"
import { V3 } from "@handstage/core"
import {
  handstagesAgentTools,
  type HandstagesAgent,
  type HandstagesAgentContext,
  type HandstagesAgentToolHandlers,
} from "@handstage/agent"

// Step 1 — connect to a local Chrome instance
const browser = await V3.connectLocal()
const ctx: HandstagesAgentContext = browser.context

// Step 2 — implement HandstagesAgentToolHandlers
class BrowserHandlers implements HandstagesAgentToolHandlers {
  constructor(private ctx: HandstagesAgentContext) {}

  async pages(_: HandstagesAgent.PagesInput): Promise<HandstagesAgent.PagesOutput> {
    return {
      pages: this.ctx.pages().map((p) => ({
        pageId: p.targetId,
        url: p.url(),
        title: p.title(),
        activated: p === this.ctx.activePage(),
      })),
    }
  }

  async newPage(input: HandstagesAgent.NewPageInput): Promise<HandstagesAgent.NewPageOutput> {
    const page = await this.ctx.newPage(input.url)
    return { pageId: page.targetId }
  }

  async goto(input: HandstagesAgent.GotoInput): Promise<HandstagesAgent.GotoOutput> {
    const page = this.ctx.pages().find((p) => p.targetId === input.pageId)
    if (!page) return { ok: false, error: "Page not found" }
    try {
      await page.goto(input.url, { waitUntil: input.waitUntil, timeout: input.timeoutMs })
      return { ok: true, url: page.url() }
    } catch (err) {
      return { ok: false, error: String(err) }
    }
  }

  async snapshot(input: HandstagesAgent.SnapshotInput): Promise<HandstagesAgent.SnapshotOutput> {
    const page = this.ctx.pages().find((p) => p.targetId === input.pageId)
    if (!page) return { ok: false, error: "Page not found" }
    try {
      const result = await page.snapshot({ includeIframes: input.includeIframes })
      return { ok: true, tree: result.tree, xpathMap: result.xpathMap, urlMap: result.urlMap }
    } catch (err) {
      return { ok: false, error: String(err) }
    }
  }

  // ... implement remaining methods
}

const handlers = new BrowserHandlers(ctx)

// Step 3 — attach execute functions to each tool
const executableTools = {
  ...handstagesAgentTools,
  pages: tool({ ...handstagesAgentTools.pages, execute: (i) => handlers.pages(i) }),
  newPage: tool({ ...handstagesAgentTools.newPage, execute: (i) => handlers.newPage(i) }),
  goto: tool({ ...handstagesAgentTools.goto, execute: (i) => handlers.goto(i) }),
  snapshot: tool({ ...handstagesAgentTools.snapshot, execute: (i) => handlers.snapshot(i) }),
  // ... attach remaining tools
}

// Step 4 — run the automation task
const result = await generateText({
  model: openai("gpt-4o"),
  tools: executableTools,
  maxSteps: 30,
  system: "You are a browser automation agent. Use the provided tools to complete tasks.",
  prompt: "Open https://example.com, read the page title, and return it.",
})

console.log(result.text)

await browser.close()
The execute functions are the only thing that separates handstagesAgentTools (schema only) from a fully runnable tool set. Keep your handler class separate from the AI SDK wiring so you can test each method in isolation.
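Testing a handler in isolation can be as simple as swapping the real context for an in-memory fake. The sketch below exercises a setActivePage handler that way, with no browser and no AI SDK involved. `TabHandlers`, `FakeContext`, `checkSetActivePage`, and the reduced `PageLike` and `ContextLike` interfaces are hypothetical and exist only for this example.

```typescript
// Reduced stand-ins for Page and HandstagesAgentContext (assumptions for this sketch).
interface PageLike {
  targetId: string
}
interface ContextLike {
  pages(): PageLike[]
  activePage(): PageLike | undefined
  setActivePage(page: PageLike): void
}
type SetActivePageOutput = { ok: true } | { ok: false; error: string }

// Handler under test: same logic as the setActivePage example on this page.
class TabHandlers {
  private ctx: ContextLike
  constructor(ctx: ContextLike) {
    this.ctx = ctx
  }

  async setActivePage(input: { pageId: string }): Promise<SetActivePageOutput> {
    const page = this.ctx.pages().find((p) => p.targetId === input.pageId)
    if (!page) return { ok: false, error: `No page with id ${input.pageId}` }
    this.ctx.setActivePage(page)
    return { ok: true }
  }
}

// In-memory context: the first tab starts in the foreground.
class FakeContext implements ContextLike {
  private tabs: PageLike[]
  private active: PageLike | undefined
  constructor(tabs: PageLike[]) {
    this.tabs = tabs
    this.active = tabs[0]
  }
  pages(): PageLike[] {
    return this.tabs
  }
  activePage(): PageLike | undefined {
    return this.active
  }
  setActivePage(page: PageLike): void {
    this.active = page
  }
}

// Throws on the first failed expectation; resolves if everything holds.
async function checkSetActivePage(): Promise<void> {
  const ctx = new FakeContext([{ targetId: "A" }, { targetId: "B" }])
  const handlers = new TabHandlers(ctx)

  const ok = await handlers.setActivePage({ pageId: "B" })
  if (!ok.ok) throw new Error("expected success for an existing tab")
  if (ctx.activePage()?.targetId !== "B") throw new Error("tab B should be active")

  const missing = await handlers.setActivePage({ pageId: "Z" })
  if (missing.ok) throw new Error("expected failure for an unknown tab")
  if (ctx.activePage()?.targetId !== "B") throw new Error("active tab should be unchanged")
}
```

The same fake can back checks for the other handlers as long as it implements the context methods they call.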
