Overview
The NovaAct class is the primary interface for building browser automation workflows with Amazon Nova Act. It manages browser sessions, executes natural language commands, and handles the agent lifecycle.
Constructor
Create a new NovaAct client instance.
Parameters
Starting web page for the browser window. Can be omitted if re-using an existing CDP page.
Path to Chrome data storage (cookies, cache, etc.). If not specified, will use a temp dir.
If multiple NovaAct instances are used in the same process, each one must have its own user_data_dir.
If True, makes a copy of user_data_dir in a temp dir for each instance. This ensures the original is not modified and that each instance has its own user data directory.
Type or instance of a custom actuator. Deviations from NovaAct’s standard observation and I/O formats may impact model performance.
Name of the Chrome user profile. Only needed if using an existing, non-Default Chrome profile. Must be a relative path within user_data_dir.
Width of the screen for the Playwright instance. This sets the window size; screenshots taken on the page use the slightly smaller viewport size.
Height of the screen for the Playwright instance. This sets the window size; screenshots taken on the page use the slightly smaller viewport size.
By default, NovaAct fails to act if the provided screen width or height is outside the acceptable range. Pass this flag to warn instead.
Whether to launch the Playwright browser in headless mode. Can also be enabled with the NOVA_ACT_HEADLESS environment variable.
Browser channel to use (e.g., “chromium”, “chrome-beta”, “msedge”). Defaults to “chrome”. Can also be specified via the NOVA_ACT_CHROME_CHANNEL environment variable.
API key for interacting with NovaAct. Will override the NOVA_ACT_API_KEY environment variable.
Add an existing Playwright instance for use.
By default, NovaAct listens for ctrl+x signals from the terminal, allowing users to exit an agent action while keeping the browser session open (ctrl+c will kill the browser). The feature requires an additional listener thread, so this variable allows users to disable the feature where a tty is not available. The NOVA_ACT_DISABLE_TTY environment variable takes precedence over this value.
A Chrome DevTools Protocol (CDP) endpoint to connect to.
Additional HTTP headers to be sent when connecting to a CDP endpoint.
If True, Nova Act will re-use an existing page from the CDP context rather than opening a new one.
Optionally override the user agent used by Playwright.
Output directory for video and agent run output. Will default to a temp dir.
Whether to record video of the browser session.
Max wait time on initial page load in seconds.
If True, ignore certificate validation errors for HTTPS URLs.
Set of security-related parameters that overwrite default agent behavior. See SecurityOptions for details.
A callback function that takes a GuardrailInputState and returns a GuardrailDecision. Called after taking an observation but before invoking step on the backend. If it returns GuardrailDecision.BLOCK, act() will raise ActGuardrailsError. See Guardrails for details.
A list of stop hooks that are called when this object is stopped.
Use the locally installed Chrome browser. Only works on macOS.
Proxy configuration for the browser. Should contain server, username, and password keys.
An implementation of human input callbacks. If not provided, no request for the human input tool will be made.
A list of client-provided tools. Use the @tool decorator to create tools. See Tools for details.
A Workflow instance to associate with this NovaAct session. See Workflow for details.
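The requirement that each instance in a process has its own user_data_dir can be met by creating one directory per instance up front. The following is a minimal sketch that uses only the standard library; the helper name make_user_data_dirs and the directory prefix are hypothetical, not part of the SDK.

```python
import tempfile
from pathlib import Path


def make_user_data_dirs(count: int, prefix: str = "nova_act_profile_") -> list[Path]:
    """Create one fresh temporary directory per planned NovaAct instance."""
    # mkdtemp() returns a unique, newly created directory on every call,
    # so no two instances end up sharing browser state.
    return [Path(tempfile.mkdtemp(prefix=prefix)) for _ in range(count)]


dirs = make_user_data_dirs(2)
# Each directory would then be handed to its own instance, for example:
#   NovaAct(user_data_dir=str(dirs[0]), ...)
#   NovaAct(user_data_dir=str(dirs[1]), ...)
```

The directories are not cleaned up automatically; remove them once the sessions are finished if the profile data should not persist.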
Methods
start()
Start the NovaAct client and launch the browser.
If using as a context manager (with NovaAct(...) as nova:), start() is called automatically.
act()
Execute a natural language command in the browser.
Parameters
The natural language task to actuate on the web browser.
The timeout (in seconds) for the task to actuate.
Configure the maximum number of steps (browser actuations) act() will take before giving up on the task. Use this to make sure the agent doesn’t get stuck forever trying different paths.
Temperature parameter for model generation.
Top-k parameter for model generation.
Seed for reproducible model generation.
Additional delay in milliseconds before taking an observation of the page.
An optional JSON Schema for the output to adhere to.
Returns
Raises
ActError - Base class for all act execution errors
ValidationFailed - Invalid input parameters
ClientNotStarted - Client was not started before calling act()
Example
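A minimal sketch of issuing several commands in sequence. It assumes nova is a NovaAct client that has already been started; the helper name run_prompts and the prompts are illustrative only, and the return values are passed through untouched.

```python
from typing import Any, Iterable, List


def run_prompts(nova: Any, prompts: Iterable[str]) -> List[Any]:
    """Send each prompt to act() in order and collect whatever act() returns."""
    results = []
    for prompt in prompts:
        # act() raises (for example ClientNotStarted) rather than returning
        # an error value, so a failure stops the loop at the failing prompt.
        results.append(nova.act(prompt))
    return results


# Possible usage with a started client:
#   with NovaAct(...) as nova:
#       run_prompts(nova, ["search for a coffee maker", "open the first result"])
```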
act_get()
Execute a natural language command and return structured data.
Parameters
The natural language task to actuate on the web browser.
A JSON Schema that the output should adhere to. Defaults to {"type": "string"} when not specified.
The timeout (in seconds) for the task to actuate.
Configure the maximum number of steps (browser actuations) before giving up.
Temperature parameter for model generation.
Top-k parameter for model generation.
Seed for reproducible model generation.
Additional delay in milliseconds before taking an observation of the page.
Returns
Contains the structured response and metadata. See ActGetResult for details.
Raises
ActError - Base class for all act execution errors
ActInvalidModelGenerationError - Result did not match expected schema
ValidationFailed - Invalid input parameters
ClientNotStarted - Client was not started before calling act_get()
Example
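A minimal sketch of requesting structured output. It assumes nova is a started NovaAct client and that the schema is passed with the keyword schema; that keyword name is an assumption, as are the helper name get_titles and the prompt text. The result is returned as-is; see ActGetResult for its fields.

```python
from typing import Any

# JSON Schema describing the expected output: a list of strings.
TITLES_SCHEMA = {"type": "array", "items": {"type": "string"}}


def get_titles(nova: Any) -> Any:
    """Ask for the visible article titles as structured data."""
    # The keyword name `schema` is an assumption; check the act_get() signature.
    return nova.act_get(
        "Return the titles of the articles on this page",
        schema=TITLES_SCHEMA,
    )
```

If the model output does not match the schema, act_get() raises ActInvalidModelGenerationError, so callers that can tolerate a failed extraction should catch it.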
Use act_get() when you need to extract information from the browser. Use act() when you only need to perform actions without extracting data.
stop()
Stop the NovaAct client and close the browser.
If using as a context manager, stop() is called automatically when exiting the context.
go_to_url()
Navigate to a specific URL and wait for the page to settle.
Parameters
The URL to navigate to.
Raises
ClientNotStarted - Client was not started before calling go_to_url()
ValidationFailed - Invalid URL or URL blocked by security options
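Because go_to_url() raises ClientNotStarted on a client that has not been started, a caller can check the started property first. A minimal sketch, assuming nova is a NovaAct instance; the helper name open_url is hypothetical.

```python
from typing import Any


def open_url(nova: Any, url: str) -> None:
    """Navigate to `url`, starting the client first if that has not happened yet."""
    if not nova.started:
        nova.start()
    # Returns once the page has settled; raises ValidationFailed for an
    # invalid URL or one blocked by the security options.
    nova.go_to_url(url)
```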
get_page()
Get a particular Playwright page by index, or the currently actuating page.
Parameters
Index of the page to retrieve. Use -1 for the currently actuating page.
Returns
The Playwright Page object.
Raises
ClientNotStarted - Client was not started
ValidationFailed - Actuator is not of type PlaywrightPageManagerBase
Only available if the provided actuator is of type PlaywrightPageManagerBase. The order of pages might not reflect their tab order if they have been moved.
register_stop_hook()
Register a stop hook that will be called during stop().
Parameters
The stop hook to register. Must implement the StopHook protocol.
unregister_stop_hook()
Unregister a previously registered stop hook.
Parameters
The stop hook to unregister.
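A minimal sketch of pairing the two methods so that a hook applies to a single stop() call only. It assumes nova is a started NovaAct client and treats hook as an opaque object that already implements the StopHook protocol; the helper name stop_with_hook is hypothetical.

```python
from typing import Any


def stop_with_hook(nova: Any, hook: Any) -> None:
    """Register `hook`, stop the client so the hook runs, then unregister it."""
    nova.register_stop_hook(hook)
    try:
        nova.stop()
    finally:
        # Unregister even if stop() raises, so the hook does not stay
        # attached to the client for a later start/stop cycle.
        nova.unregister_stop_hook(hook)
```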
Properties
started
Check if the client has been started.
Returns True if the actuator is started and the session ID is set.
page
Get the current Playwright page.
The Playwright Page on which the SDK is currently actuating.
To get a specific page, use nova.pages to list all pages, then fetch with nova.get_page(i).
pages
Get all Playwright pages.
List of all Playwright Page objects in the browser context.
The order might not reflect tab order in the window if pages have been moved.
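A minimal sketch of listing the open pages and looking up the active one. It assumes nova is a started NovaAct client backed by a Playwright actuator; url is the standard Playwright Page property, and the helper name describe_pages is hypothetical.

```python
from typing import Any, List, Tuple


def describe_pages(nova: Any) -> Tuple[List[Tuple[int, str]], str]:
    """Return (index, url) pairs for every open page, plus the active page's URL."""
    listing = [(i, page.url) for i, page in enumerate(nova.pages)]
    # -1 selects the page the SDK is currently actuating on.
    active_url = nova.get_page(-1).url
    return listing, active_url
```

Since the list order may differ from the tab order in the window, identifying a page by its URL is usually more reliable than relying on its position.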
dispatcher
Get the ActDispatcher for actuation.
The dispatcher instance for sending act prompts to the browser.
Context Manager
The NovaAct class can be used as a context manager for automatic lifecycle management.
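A minimal sketch of the context-manager pattern. The helper run_once and its make_client argument are hypothetical; make_client stands for any zero-argument callable that returns a configured NovaAct instance.

```python
from typing import Any, Callable


def run_once(make_client: Callable[[], Any], prompt: str) -> Any:
    """Run one command inside a managed session and return what act() returns."""
    # Entering the block calls start(); leaving it calls stop(),
    # including when act() raises.
    with make_client() as nova:
        return nova.act(prompt)


# Possible usage:
#   run_once(lambda: NovaAct(...), "click the login button")
```

Taking a factory instead of a ready-made client keeps construction and lifecycle in one place and lets the function be exercised with a stub in tests.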