NovaAct

Overview

The NovaAct class is the primary interface for building browser automation workflows with Amazon Nova Act. It manages browser sessions, executes natural language commands, and handles the agent lifecycle.

Constructor

Create a new NovaAct client instance.

from nova_act import NovaAct

nova = NovaAct(
    starting_page="https://example.com",
    headless=False,
    screen_width=1600,
    screen_height=900
)

Parameters

starting_page

str | None

Starting web page for the browser window. Can be omitted if re-using an existing CDP page.

user_data_dir

str | None

Path to Chrome data storage (cookies, cache, etc.). If not specified, a temporary directory is used.

If multiple NovaAct instances are used in the same process, each one must have its own user_data_dir.
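One way to satisfy the one-directory-per-instance rule is to create a fresh temporary directory for each instance. The sketch below is illustrative and not part of the SDK: `build_instance_kwargs` is a hypothetical helper that only prepares keyword arguments (using the `starting_page` and `user_data_dir` parameters described on this page) and never launches a browser.

```python
import tempfile


def build_instance_kwargs(starting_pages):
    """Return one kwargs dict per page, each with its own user_data_dir.

    Hypothetical helper: it builds arguments only and does not import nova_act.
    """
    kwargs_list = []
    for url in starting_pages:
        # mkdtemp creates a new, uniquely named directory on every call.
        data_dir = tempfile.mkdtemp(prefix="nova_act_profile_")
        kwargs_list.append({"starting_page": url, "user_data_dir": data_dir})
    return kwargs_list
```

Each dict could then be passed as `NovaAct(**kwargs)`. The directories created here are not removed automatically, so cleanup is left to the caller.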

clone_user_data_dir

bool

default:"True"

If True, will make a copy of user_data_dir into a temp dir for each instance. This ensures the original is not modified and that each instance has its own user data directory.

actuator

ManagedActuatorType | BrowserActuatorBase

default:"DefaultNovaLocalBrowserActuator"

Type or instance of a custom actuator. Deviations from NovaAct’s standard observation and I/O formats may impact model performance.

profile_directory

str | None

Name of the Chrome user profile. Only needed when using an existing, non-Default Chrome profile. Must be a relative path within user_data_dir.

screen_width

int

default:"1600"

Width of the screen for the Playwright instance. This sets the window size; screenshots of the page are taken at the slightly smaller viewport size.

Changing the default might impact agent performance.

screen_height

int

default:"900"

Height of the screen for the Playwright instance. This sets the window size; screenshots of the page are taken at the slightly smaller viewport size.

Changing the default might impact agent performance.

ignore_screen_dims_check

bool

default:"False"

By default, NovaAct fails to act if the provided screen width or height is outside the acceptable range. Set this flag to True to log a warning instead.

headless

bool

default:"False"

Whether to launch the Playwright browser in headless mode. Can also be enabled with the NOVA_ACT_HEADLESS environment variable.

chrome_channel

str | None

Browser channel to use (e.g., “chromium”, “chrome-beta”, “msedge”). Defaults to “chrome”. Can also be specified via NOVA_ACT_CHROME_CHANNEL environment variable.

nova_act_api_key

str | None

API key for interacting with NovaAct. Overrides the NOVA_ACT_API_KEY environment variable.

playwright_instance

Playwright | None

An existing Playwright instance to use.

tty

bool

default:"True"

By default, NovaAct listens for ctrl+x signals from the terminal, which lets users exit the current agent action while keeping the browser session open (ctrl+c kills the browser). This feature requires an additional listener thread, so set this parameter to False where a tty is not available. The NOVA_ACT_DISABLE_TTY environment variable takes precedence over this value.

cdp_endpoint_url

str | None

A Chrome DevTools Protocol (CDP) endpoint to connect to.

cdp_headers

dict[str, str] | None

Additional HTTP headers to be sent when connecting to a CDP endpoint.

cdp_use_existing_page

bool

default:"False"

If True, Nova Act will re-use an existing page from the CDP context rather than opening a new one.
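The three CDP parameters are typically used together. The sketch below assembles them into constructor keyword arguments; `build_cdp_kwargs` is a hypothetical helper, and its check that `starting_page` is needed when no existing page is re-used is an assumption drawn from the `starting_page` description above, not confirmed SDK behavior.

```python
def build_cdp_kwargs(endpoint_url, headers=None, reuse_page=False, starting_page=None):
    """Assemble constructor kwargs for attaching to a CDP endpoint.

    Hypothetical helper; the dict keys are the parameters documented on this page.
    """
    if not reuse_page and starting_page is None:
        # Assumption: without an existing page to re-use, a starting page is expected.
        raise ValueError("starting_page is required unless reuse_page is True")
    kwargs = {
        "cdp_endpoint_url": endpoint_url,
        "cdp_use_existing_page": reuse_page,
    }
    if headers:
        # Copy so later changes to the caller's dict do not leak in.
        kwargs["cdp_headers"] = dict(headers)
    if starting_page is not None:
        kwargs["starting_page"] = starting_page
    return kwargs
```

The result could be passed as `NovaAct(**build_cdp_kwargs(...))`; the endpoint URL and header values are whatever the remote browser provider supplies.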

user_agent

str | None

Optionally override the user agent used by Playwright.

logs_directory

str | None

Output directory for video and agent run output. Defaults to a temporary directory.

record_video

bool

default:"False"

Whether to record video of the browser session.

go_to_url_timeout

int | None

Maximum wait time, in seconds, for the initial page load.

ignore_https_errors

bool

default:"False"

If True, ignore certificate validation errors for https urls.

security_options

SecurityOptions | None

Set of security-related parameters that overwrite default agent behavior. See SecurityOptions for details.

state_guardrail

GuardrailCallable | None

A callback function that takes a GuardrailInputState and returns a GuardrailDecision. Called after taking an observation but before invoking step on the backend. If it returns GuardrailDecision.BLOCK, act() will raise ActGuardrailsError. See Guardrails for details.
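As an illustration of the callback shape, the sketch below implements a host allowlist. It runs without the SDK by using local stand-ins: `Decision` mimics GuardrailDecision (this page only names BLOCK; the PASS member is an assumption) and `InputState` mimics GuardrailInputState with an assumed `browser_url` field. Check the Guardrails page for the real types before adapting it.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlparse


class Decision(Enum):
    """Local stand-in for GuardrailDecision; PASS is an assumed member name."""
    PASS = "pass"
    BLOCK = "block"


@dataclass
class InputState:
    """Local stand-in for GuardrailInputState; the browser_url field is assumed."""
    browser_url: str


# Hypothetical allowlist for illustration.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def host_allowlist_guardrail(state):
    """Block any step whose current page is outside ALLOWED_HOSTS."""
    host = urlparse(state.browser_url).hostname
    if host in ALLOWED_HOSTS:
        return Decision.PASS
    # Unknown hosts and URLs without a hostname (e.g. about:blank) are blocked.
    return Decision.BLOCK
```

With the real types substituted, a function like this would be passed as `state_guardrail=host_allowlist_guardrail`.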

stop_hooks

list[StopHook]

default:"[]"

A list of stop hooks that are called when this object is stopped.

use_default_chrome_browser

bool

default:"False"

Use the locally installed Chrome browser. Only works on macOS.

proxy

dict[str, str] | None

Proxy configuration for the browser. Should contain server, username, and password keys.
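Proxy settings are often available as a single URL. The sketch below splits such a URL into the three keys named above; `proxy_from_url` is a hypothetical helper, and using empty strings when the URL carries no credentials is an assumption rather than documented behavior.

```python
from urllib.parse import unquote, urlsplit


def proxy_from_url(proxy_url):
    """Split 'scheme://user:pass@host:port' into server/username/password keys."""
    parts = urlsplit(proxy_url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not a usable proxy URL: {proxy_url!r}")
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port:
        server += f":{parts.port}"
    return {
        "server": server,
        # Credentials may be percent-encoded inside a URL, so decode them.
        "username": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
    }
```

The returned dict could then be passed as the `proxy` argument.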

human_input_callbacks

HumanInputCallbacksBase | None

An implementation of human input callbacks. If not provided, the agent will not request human input.

tools

list[ActionType] | None

A list of client-provided tools. Use the @tool decorator to create tools. See Tools for details.

workflow

Workflow | None

A Workflow instance to associate with this NovaAct session. See Workflow for details.

Methods

start()

Start the NovaAct client and launch the browser.

nova = NovaAct(starting_page="https://example.com")
nova.start()

If using as a context manager (with NovaAct(...) as nova:), start() is called automatically.

act()

Execute a natural language command in the browser.

result = nova.act("Click the login button")

Parameters

prompt

str

required

The natural language task to actuate on the web browser.

timeout

int | None

The timeout (in seconds) for the task to actuate.

max_steps

int | None

default:"30"

Configure the maximum number of steps (browser actuations) act() will take before giving up on the task. Use this to make sure the agent doesn’t get stuck forever trying different paths.

model_temperature

float | None

Temperature parameter for model generation.

model_top_k

int | None

Top-k parameter for model generation.

model_seed

int | None

Seed for reproducible model generation.

observation_delay_ms

int | None

Additional delay in milliseconds before taking an observation of the page.

schema

Mapping[str, JsonValue] | None

deprecated

An optional JSON Schema that the output should adhere to.

Deprecated: Use act_get() instead for structured responses.

Returns

result

ActResult

Contains metadata about the execution. See ActResult for details.

Raises

ActError - Base class for all act execution errors
ValidationFailed - Invalid input parameters
ClientNotStarted - Client was not started before calling act()

Example

from nova_act import NovaAct

with NovaAct(starting_page="https://example.com") as nova:
    result = nova.act(
        "Find flights from Boston to Seattle",
        timeout=120,
        max_steps=50
    )
    print(f"Completed in {result.metadata.num_steps_executed} steps")
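Because act() raises on failure, callers sometimes wrap it in a retry loop. The sketch below is one possible pattern, not SDK behavior: `act_with_retry` is a hypothetical helper, and the exception class to retry on is a parameter so the code runs without the SDK installed. In real use one would likely pass the ActError class listed above; whether a given failure is worth retrying is a judgment call.

```python
import time


def act_with_retry(nova, prompt, attempts=3, delay_s=1.0, retry_on=Exception, **act_kwargs):
    """Call nova.act(prompt, **act_kwargs), retrying when retry_on is raised."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return nova.act(prompt, **act_kwargs)
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            time.sleep(delay_s)
```

A call such as `act_with_retry(nova, "Click the login button", max_steps=20)` forwards `max_steps` unchanged to act().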

act_get()

Execute a natural language command and return structured data.

result = nova.act_get(
    "How many search results are on this page?",
    schema={"type": "integer"}
)
print(result.parsed_response)  # Returns an integer

Parameters

prompt

str

required

The natural language task to actuate on the web browser.

schema

Mapping[str, JsonValue]

default:"STRING_SCHEMA"

A JSON Schema that the output should adhere to. Defaults to {"type": "string"} when not specified.

timeout

int | None

The timeout (in seconds) for the task to actuate.

max_steps

int | None

default:"30"

Configure the maximum number of steps (browser actuations) before giving up.

model_temperature

float | None

Temperature parameter for model generation.

model_top_k

int | None

Top-k parameter for model generation.

model_seed

int | None

Seed for reproducible model generation.

observation_delay_ms

int | None

Additional delay in milliseconds before taking an observation of the page.

Returns

result

ActGetResult

Contains the structured response and metadata. See ActGetResult for details.

Raises

ActError - Base class for all act execution errors
ActInvalidModelGenerationError - Result did not match expected schema
ValidationFailed - Invalid input parameters
ClientNotStarted - Client was not started before calling act_get()

Example

from nova_act import NovaAct, STRING_SCHEMA

with NovaAct(starting_page="https://example.com") as nova:
    # Get a string response
    result = nova.act_get(
        "What is the title of this page?",
        schema=STRING_SCHEMA
    )
    print(result.parsed_response)
    
    # Get structured data
    result = nova.act_get(
        "How many products are on this page?",
        schema={"type": "integer"}
    )
    count = result.parsed_response

Use act_get() when you need to extract information from the browser. Use act() when you only need to perform actions without extracting data.
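Schemas are not limited to scalars. The sketch below requests a list of objects and post-processes the parsed response; it assumes act_get() accepts array and object schemas and that `parsed_response` holds the decoded value, as the integer example above suggests. `PRODUCT_LIST_SCHEMA` and `cheapest_product` are hypothetical names.

```python
# JSON Schema for a list of {"name": str, "price": number} objects.
PRODUCT_LIST_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "price": {"type": "number"},
        },
        "required": ["name", "price"],
    },
}


def cheapest_product(nova):
    """Ask for all products on the page and return the lowest-priced one, or None."""
    result = nova.act_get(
        "List the name and price of each product on this page",
        schema=PRODUCT_LIST_SCHEMA,
    )
    products = result.parsed_response or []
    if not products:
        return None
    return min(products, key=lambda item: item["price"])
```

Sorting or filtering in Python after extraction keeps the prompt simple and leaves the comparison logic deterministic.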

stop()

Stop the NovaAct client and close the browser.

nova.stop()

If using as a context manager, stop() is called automatically when exiting the context.
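When the context manager is not an option, start() and stop() can be paired with try/finally so the browser is closed even if a task fails. A minimal sketch, with `run_session` as a hypothetical helper that accepts any object exposing start() and stop():

```python
def run_session(nova, task):
    """Start the client, run task(nova), and always stop the client afterwards."""
    # start() stays outside the try block: if it fails there is nothing to stop.
    nova.start()
    try:
        return task(nova)
    finally:
        nova.stop()
```

For example, `run_session(nova, lambda n: n.act("Click the login button"))` returns whatever the task returns and re-raises any error after stop() has run.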

go_to_url()

Navigate to a specific URL and wait for the page to settle.

nova.go_to_url("https://example.com")

Parameters

url

str

required

The URL to navigate to.

Raises

ClientNotStarted - Client was not started before calling go_to_url()
ValidationFailed - Invalid URL or URL blocked by security options

get_page()

Get a Playwright page by index, or the page currently being actuated.

page = nova.get_page(0)  # Get first page
page = nova.get_page()    # Get current page (index=-1)

Parameters

index

int

default:"-1"

Index of the page to retrieve. Use -1 for the currently actuating page.

Returns

page

playwright.Page

The Playwright Page object.

Raises

ClientNotStarted - Client was not started
ValidationFailed - Actuator is not of type PlaywrightPageManagerBase

Only available if the provided actuator is of type PlaywrightPageManagerBase. The order of pages might not reflect their tab order if they have been moved.

register_stop_hook()

Register a stop hook to be called when the client is stopped.

def cleanup_hook(nova: NovaAct) -> None:
    print("Cleaning up...")

nova.register_stop_hook(cleanup_hook)

Parameters

hook

StopHook

required

The stop hook to register. Must implement the StopHook protocol.

unregister_stop_hook()

Unregister a previously registered stop hook.

nova.unregister_stop_hook(cleanup_hook)

Parameters

hook

StopHook

required

The stop hook to unregister.

Properties

started

Check if the client has been started.

if nova.started:
    print("Client is running")

started

bool

Returns True if the actuator is started and session ID is set.

page

Get the current Playwright page.

page = nova.page
page.screenshot(path="screenshot.png")

page

playwright.Page

The Playwright Page on which the SDK is currently actuating.

To get a specific page, use nova.pages to list all pages, then fetch with nova.get_page(i).

pages

Get all Playwright pages.

all_pages = nova.pages
print(f"Total pages: {len(all_pages)}")

pages

list[playwright.Page]

List of all Playwright Page objects in the browser context.

The order might not reflect tab order in the window if pages have been moved.
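Since list order is not a reliable guide to tab order, it can be safer to locate a page by its URL than by a fixed index. A minimal sketch, with `find_page_index` as a hypothetical helper that relies only on the `url` property of Playwright Page objects:

```python
def find_page_index(pages, url_fragment):
    """Return the index of the first page whose URL contains url_fragment, or None."""
    for index, page in enumerate(pages):
        if url_fragment in page.url:
            return index
    return None
```

The index from `find_page_index(nova.pages, "checkout")` can then be passed to `nova.get_page(...)`, after checking that it is not None.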

dispatcher

Get the ActDispatcher for actuation.

dispatcher = nova.dispatcher

dispatcher

ActDispatcher

The dispatcher instance for sending act prompts to the browser.

Context Manager

The NovaAct class can be used as a context manager for automatic lifecycle management.

with NovaAct(starting_page="https://example.com") as nova:
    nova.act("Click the login button")
# Browser is automatically closed when exiting the context

Complete Example

from nova_act import NovaAct, SecurityOptions

# Initialize with security options
security = SecurityOptions(
    allowed_file_upload_paths=["/home/user/documents/*"],
    allowed_file_open_paths=["/home/user/files/*"]
)

with NovaAct(
    starting_page="https://example.com",
    headless=False,
    screen_width=1920,
    screen_height=1080,
    security_options=security,
    record_video=True
) as nova:
    # Execute commands
    nova.act("Search for flights from Boston to Seattle")
    
    # Extract structured data
    result = nova.act_get(
        "What is the price of the cheapest flight?",
        schema={"type": "number"}
    )
    
    print(f"Cheapest flight: ${result.parsed_response}")
    print(f"Steps executed: {result.metadata.num_steps_executed}")
    print(f"Time worked: {result.metadata.time_worked_s}s")
