AIAgent Python API Reference

The AIAgent class in run_agent.py is the core of OpenGauss — a synchronous conversation loop that calls an LLM, dispatches tool calls, and keeps iterating until the model returns a plain response. You can import it directly and drive it programmatically without the interactive CLI.

Installation

PyPI

pip install gauss-agent

From source

git clone https://github.com/math-inc/OpenGauss
cd OpenGauss
pip install -e .

Full extras

# All optional dependencies (messaging, speech, image, MCP, …)
pip install "gauss-agent[all]"

Entry Points

Installing the package registers three CLI entry points defined in pyproject.toml:

Command	Module	Purpose
`gauss`	`gauss_cli.main:main`	Interactive CLI (TUI, slash commands, skin engine)
`gauss-agent`	`run_agent:main`	Headless agent runner — single prompt or Fire CLI
`gauss-acp`	`acp_adapter.entry:main`	ACP server for VS Code, Zed, and JetBrains integration

Quick Start

Here is the minimal pattern for programmatic use. Additional examples follow the full parameter reference below.

from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    max_iterations=10,
    enabled_toolsets=["file", "terminal"],
    quiet_mode=True,
)
response = agent.chat("List the files in the current directory")
print(response)

Changing toolsets or rebuilding the system prompt mid-conversation invalidates the prompt cache. OpenGauss caches the conversation prefix using Anthropic’s prompt caching (auto-enabled for Claude models via OpenRouter or the native Anthropic API). Do not alter enabled_toolsets or disabled_toolsets after the first API call, and do not reload memories or reconstruct the system prompt mid-session. Breaking the cache means the whole prefix is billed again as uncached input, which raises costs sharply on long conversations. The only legitimate context alteration is an automatic context compression event.

Constructor: `AIAgent.__init__`

from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    max_iterations=90,
    quiet_mode=True,
)

Core Parameters

model

str

default:"anthropic/claude-opus-4.6"

Model identifier in OpenRouter format (e.g. "anthropic/claude-opus-4.6", "openai/gpt-4o"). Pass a bare model name when using a direct provider via base_url.

max_iterations

int

default:"90"

Maximum number of LLM API calls (tool-calling iterations) for the entire session. This budget is shared across the parent agent and any spawned subagents via an IterationBudget instance. Programmatic tool calls via execute_code are refunded and do not count against this limit.

enabled_toolsets

list[str]

default:"None"

Allowlist of toolset names to load. When set, only tools belonging to these toolsets are available. Examples: ["file", "terminal"], ["web", "browser"]. If None, all toolsets are enabled (subject to disabled_toolsets).

disabled_toolsets

list[str]

default:"None"

Denylist of toolset names to exclude. Applied after enabled_toolsets. Use this to strip a single toolset from an otherwise full configuration, e.g. disabled_toolsets=["browser"].
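The interaction between the two lists can be sketched as follows. This is a minimal illustration of allowlist-then-denylist filtering, not the library's actual code: `resolve_toolsets` and the `available` list are hypothetical names introduced for the example.

```python
from typing import Iterable, List, Optional


def resolve_toolsets(
    available: Iterable[str],
    enabled: Optional[List[str]] = None,
    disabled: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical helper: apply the allowlist first, then the denylist."""
    # None means "no allowlist", so every available toolset passes this stage.
    selected = [name for name in available if enabled is None or name in enabled]
    # The denylist is applied afterwards, so it wins over the allowlist.
    blocked = set(disabled or [])
    return [name for name in selected if name not in blocked]


available = ["file", "terminal", "web", "browser"]
print(resolve_toolsets(available, disabled=["browser"]))
print(resolve_toolsets(available, enabled=["web", "browser"], disabled=["browser"]))
```

Because the denylist runs last, a toolset named in both lists ends up excluded.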

quiet_mode

bool

default:"false"

Suppress all initialization banners, spinner output, and tool-progress prints. Set this to True for programmatic or batch use where stdout cleanliness matters. Log levels for internal modules are also raised to ERROR when quiet mode is active.

save_trajectories

bool

default:"false"

When True, each completed conversation is serialized to a JSONL trajectory file in the standard from/value format used for LLM fine-tuning. batch_runner.py handles trajectory saving itself and always passes save_trajectories=False to avoid double-writing.

platform

str

default:"None"

The interface platform the user is on. Used to inject platform-specific formatting hints into the system prompt. Recognized values include "cli", "telegram", "discord", "whatsapp", and "slack".

session_id

str

default:"None"

Pre-generated session identifier used for log filenames and the SQLite session store. Auto-generated as YYYYMMDD_HHMMSS_<6-char-hex> when not provided. Pass an existing session ID to resume a conversation in the same log file.
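If you want to pre-generate an identifier in the same shape, something like the following should work. It is a sketch of the documented YYYYMMDD_HHMMSS_<6-char-hex> format only. `make_session_id` and `SESSION_ID_PATTERN` are hypothetical names, and the agent's own generator may differ in details such as the time zone or the randomness source.

```python
import re
import secrets
from datetime import datetime
from typing import Optional

# Matches the documented shape: date, time, then six lowercase hex characters.
SESSION_ID_PATTERN = re.compile(r"^\d{8}_\d{6}_[0-9a-f]{6}$")


def make_session_id(now: Optional[datetime] = None) -> str:
    """Hypothetical generator producing YYYYMMDD_HHMMSS_<6-char-hex>."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    # token_hex(3) yields 3 random bytes, i.e. 6 hexadecimal characters.
    return f"{stamp}_{secrets.token_hex(3)}"


session_id = make_session_id(datetime(2025, 1, 31, 14, 5, 9))
print(session_id[:15])  # 20250131_140509
```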

skip_context_files

bool

default:"false"

When True, skips auto-injection of SOUL.md, AGENTS.md, and .cursorrules into the system prompt. Set this to True for batch processing and data generation to prevent user-specific persona files from polluting trajectories.

skip_memory

bool

default:"false"

When True, disables loading of persistent memory (MEMORY.md, USER.md) into the system prompt. Recommended for batch runs where cross-session memory is undesirable.

Provider and API Mode Parameters

base_url

str

default:"None"

Base URL for the LLM API. Defaults to https://openrouter.ai/api/v1 when not provided. Override to target a local model server, direct Anthropic, or direct OpenAI.

api_key

str

default:"None"

API key for authentication. When omitted, the agent resolves credentials from environment variables (e.g. OPENROUTER_API_KEY, ANTHROPIC_API_KEY) via the provider router. Explicit keys take precedence.

provider

str

default:"None"

Provider identifier hint used for routing decisions. Common values: "openrouter", "anthropic", "openai", "openai-codex". When None, the provider is inferred from base_url.

api_mode

str

default:"None"

API protocol override. Accepted values: "chat_completions" (default for OpenRouter/OpenAI-compatible), "anthropic_messages" (Anthropic native API), "codex_responses" (OpenAI Codex Responses API). Inferred automatically from base_url when not set.

OpenRouter Routing Parameters

providers_allowed

list[str]

default:"None"

Allowlist of OpenRouter provider backends (e.g. ["anthropic", "google"]). Requests will only be routed to these providers.

providers_ignored

list[str]

default:"None"

Denylist of OpenRouter provider backends to exclude (e.g. ["together", "deepinfra"]).

providers_order

list[str]

default:"None"

Ordered list of OpenRouter providers to try in sequence (e.g. ["anthropic", "openai", "google"]).

provider_sort

str

default:"None"

Sort OpenRouter providers dynamically by "price", "throughput", or "latency".

provider_require_parameters

bool

default:"false"

When True, only allow OpenRouter providers that support every requested parameter (e.g. max_tokens, reasoning).

provider_data_collection

str

default:"None"

OpenRouter data collection policy. Pass "deny" to opt out of provider data collection on platforms that support it.
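For orientation, the six routing parameters above line up with fields of the `provider` object that OpenRouter accepts in a request body (`only`, `ignore`, `order`, `sort`, `require_parameters`, `data_collection`). The sketch below shows one plausible translation. Whether AIAgent builds its request exactly this way is an assumption, and `build_provider_preferences` is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional


def build_provider_preferences(
    providers_allowed: Optional[List[str]] = None,
    providers_ignored: Optional[List[str]] = None,
    providers_order: Optional[List[str]] = None,
    provider_sort: Optional[str] = None,
    provider_require_parameters: bool = False,
    provider_data_collection: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical mapping of constructor arguments to OpenRouter fields."""
    prefs: Dict[str, Any] = {}
    if providers_allowed:
        prefs["only"] = providers_allowed
    if providers_ignored:
        prefs["ignore"] = providers_ignored
    if providers_order:
        prefs["order"] = providers_order
    if provider_sort:
        prefs["sort"] = provider_sort
    if provider_require_parameters:
        prefs["require_parameters"] = True
    if provider_data_collection:
        prefs["data_collection"] = provider_data_collection
    # Unset options are omitted so OpenRouter's own defaults apply.
    return prefs


print(build_provider_preferences(
    providers_order=["anthropic", "google"],
    provider_sort="latency",
    provider_data_collection="deny",
))
```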

Callback Parameters

tool_progress_callback

callable

default:"None"

Called with (tool_name: str, args_preview: str) just before each tool invocation. Use this to surface real-time tool progress in a custom UI.
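A progress callback only needs to accept the two documented string arguments. The sketch below records each event and prints a one-line status. The tool name and argument preview in the simulated call are made up for illustration.

```python
from typing import List, Tuple

# Collected (tool_name, args_preview) pairs, in invocation order.
progress_log: List[Tuple[str, str]] = []


def on_tool_progress(tool_name: str, args_preview: str) -> None:
    """Record each tool invocation; a real UI would render it instead."""
    progress_log.append((tool_name, args_preview))
    print(f"[tool] {tool_name}: {args_preview}")


# Simulated invocation, standing in for what the agent does before a tool call.
on_tool_progress("read_file", '{"path": "README.md"}')
```

You would then pass it to the constructor as tool_progress_callback=on_tool_progress.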

thinking_callback

callable

default:"None"

Called with the raw thinking/reasoning content string as the model streams it.

reasoning_callback

callable

default:"None"

Alternative reasoning-content callback. Receives structured reasoning data from providers that expose it (e.g. DeepSeek, Qwen).

clarify_callback

callable

default:"None"

Called with (question: str, choices: list) -> str when the agent invokes the clarify tool to ask the user a question. Must return the user’s answer as a string. If None, the clarify tool returns an error message instead of blocking.
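For unattended runs, a non-interactive handler can satisfy the (question, choices) -> str contract without blocking on input. This is a minimal sketch. The policy of taking the first choice, and the fallback sentence, are arbitrary decisions made for the example.

```python
from typing import List


def auto_clarify(question: str, choices: List[str]) -> str:
    """Non-interactive clarify handler for unattended runs (a sketch)."""
    # Prefer the first offered choice; fall back to a fixed free-text answer
    # when the agent asked an open question with no choices.
    if choices:
        return choices[0]
    return "No preference; use your best judgment."


print(auto_clarify("Which format?", ["json", "yaml"]))  # json
print(auto_clarify("Any constraints?", []))
```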

step_callback

callable

default:"None"

Called after each completed tool-calling iteration. Useful for progress tracking in long-running tasks.

Generation and Context Parameters

max_tokens

int

default:"None"

Maximum tokens in each model response. Defaults to the model’s native limit when not set. Automatically mapped to max_completion_tokens for direct OpenAI API targets.

reasoning_config

dict

default:"None"

OpenRouter reasoning configuration. Defaults to {"enabled": True, "effort": "medium"} for Claude models. Set {"effort": "none"} to disable chain-of-thought. Set {"effort": "high"} for deeper reasoning on complex tasks.

prefill_messages

list[dict]

default:"None"

Messages prepended to every conversation as prefilled context. Each item is an OpenAI-format message dict: {"role": "user" | "assistant", "content": "..."}. Use for few-shot priming or persistent persona injection.
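A few-shot prefill is an ordinary list of message dicts. The check below is a hypothetical pre-flight helper (`validate_prefill` is not part of the library) that catches malformed items before they reach the constructor.

```python
from typing import Dict, List

ALLOWED_ROLES = {"user", "assistant"}


def validate_prefill(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Check that every prefill item has an allowed role and string content."""
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"prefill[{index}] has unsupported role {role!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"prefill[{index}] content must be a string")
    return messages


few_shot = validate_prefill([
    {"role": "user", "content": "Summarize: the build failed on step 3."},
    {"role": "assistant", "content": "Build failure at step 3."},
])
print(len(few_shot))  # 2
```

The validated list can then be passed as prefill_messages=few_shot.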

ephemeral_system_prompt

str

default:"None"

A system prompt that is used during agent execution but is not saved to trajectories or session logs. Useful for injecting ephemeral task context that should not appear in training data.

Miscellaneous Parameters

tool_delay

float

default:"1.0"

Delay in seconds between consecutive tool calls. Increase to avoid rate limits on external APIs; set to 0.0 for maximum throughput in offline batch runs.

verbose_logging

bool

default:"false"

Enable DEBUG-level logging for the agent internals. Suppresses third-party library noise (OpenAI SDK, httpx) but surfaces model call details and tool dispatch traces.

log_prefix

str

default:""

Prefix string prepended to all console log messages. Useful in parallel batch runs to identify which worker produced a log line (e.g. "[B2:P17]").

log_prefix_chars

int

default:"100"

Maximum characters shown in log previews for tool arguments and responses.

session_db

object

default:"None"

An optional SessionDB instance (SQLite-backed, from gauss_state.py). When provided, all conversation messages are persisted for later retrieval via gauss session commands.

iteration_budget

IterationBudget

default:"None"

A pre-existing IterationBudget instance to share between a parent agent and its subagents. When None, a new budget is created from max_iterations. Pass the parent’s budget to child agents so all subagents consume from the same pool.
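The idea of a shared pool can be illustrated with a small stand-in class. Only the `remaining` attribute is taken from this page. `SharedBudget`, `consume`, and `refund` are hypothetical names and do not describe the real IterationBudget interface.

```python
import threading


class SharedBudget:
    """Hypothetical stand-in for IterationBudget; not the real class."""

    def __init__(self, max_iterations: int) -> None:
        self._max = max_iterations
        self._used = 0
        self._lock = threading.Lock()

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._max - self._used

    def consume(self) -> bool:
        """Take one iteration; return False when the pool is empty."""
        with self._lock:
            if self._used >= self._max:
                return False
            self._used += 1
            return True

    def refund(self) -> None:
        """Give one iteration back, e.g. for a call that should not count."""
        with self._lock:
            self._used = max(0, self._used - 1)


budget = SharedBudget(max_iterations=3)
budget.consume()  # the parent agent uses one iteration
budget.consume()  # a subagent holding the same object uses another
budget.refund()   # a refunded call returns to the pool
print(budget.remaining)  # 2
```

Because parent and child hold the same object, neither can exceed the combined limit on its own.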

fallback_model

dict

default:"None"

A single backup model tried when the primary is unavailable (rate limit, overload, connection failure). Format: {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}.
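The single-backup behavior can be pictured as one retry against a second model. The sketch below uses a stub in place of a real API call. `ProviderUnavailable`, `call_with_fallback`, and `fake_call` are hypothetical names, and the agent's real failure detection is not shown on this page.

```python
from typing import Callable, Dict, List, Optional


class ProviderUnavailable(Exception):
    """Hypothetical error for rate limits, overloads, or connection failures."""


def call_with_fallback(
    call_model: Callable[[str, str], str],
    model: str,
    fallback_model: Optional[Dict[str, str]],
    prompt: str,
) -> str:
    """Try the primary model once, then the single configured backup."""
    try:
        return call_model(model, prompt)
    except ProviderUnavailable:
        if fallback_model is None:
            raise  # no backup configured, so surface the original failure
        return call_model(fallback_model["model"], prompt)


attempts: List[str] = []


def fake_call(model: str, prompt: str) -> str:
    """Stub transport: the primary model always fails in this demo."""
    attempts.append(model)
    if model == "anthropic/claude-opus-4.6":
        raise ProviderUnavailable("overloaded")
    return f"{model} answered: {prompt}"


reply = call_with_fallback(
    fake_call,
    "anthropic/claude-opus-4.6",
    {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"},
    "ping",
)
print(reply)  # anthropic/claude-sonnet-4 answered: ping
```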

checkpoints_enabled

bool

default:"false"

Enable filesystem checkpointing — automatic snapshots of the working directory after destructive operations (file writes, deletes). Stored transparently; not exposed as a tool.

checkpoint_max_snapshots

int

default:"50"

Maximum number of filesystem snapshots to retain when checkpoints_enabled=True. Older snapshots are pruned automatically once this limit is reached.

pass_session_id

bool

default:"false"

When True, injects the session_id into the system prompt so the model is aware of its own session identifier. Useful for multi-session workflows where the model needs to reference itself.

Methods

`chat(message)`

The simplest way to run the agent — pass a single message string and get a response string back.

def chat(self, message: str, stream_callback: Optional[callable] = None) -> str:
    """Simple interface — returns final response string."""

Parameters:

Parameter	Type	Description
`message`	`str`	The user’s input message.
`stream_callback`	`callable`	Optional callback invoked with each text delta during streaming. Used by TTS pipelines to begin audio generation before the full response arrives.

Example:

from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    quiet_mode=True,
    enabled_toolsets=["file"],
)

response = agent.chat("How many Python files are in the current directory?")
print(response)
# → "There are 12 Python files in the current directory."

chat() is a thin wrapper around run_conversation() — it calls it with no system message or history and returns result["final_response"].

`run_conversation(user_message, ...)`

The full conversation interface. Returns a rich result dictionary that includes the final response, the complete message history, and metadata.

def run_conversation(
    self,
    user_message: str,
    system_message: str = None,
    conversation_history: list = None,
    task_id: str = None,
    stream_callback: Optional[callable] = None,
    persist_user_message: Optional[str] = None,
) -> dict:
    """Full interface — returns dict with final_response + messages."""

Parameters:

Parameter	Type	Description
`user_message`	`str`	The user’s input to process.
`system_message`	`str`	Optional system prompt appended to the agent’s default identity prompt. Cached for the session lifetime — do not change between turns.
`conversation_history`	`list`	Prior messages in OpenAI format to prepend. Enables multi-turn conversations across separate `run_conversation` calls.
`task_id`	`str`	Unique identifier for this task. Isolates VM/browser resources per task. Required for parallel batch processing.
`stream_callback`	`callable`	Optional callback invoked with each text delta during streaming. Used by TTS pipelines to start audio generation before the full response arrives. When `None`, the standard non-streaming path is used.
`persist_user_message`	`str`	Optional clean user message to store in transcripts and history when `user_message` contains API-only synthetic prefixes that should not be persisted.
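As an illustration of the TTS use case, a stream callback can buffer text deltas and release them one sentence at a time. This sketch assumes the callback receives each delta as a plain string. `SentenceBuffer` is a hypothetical helper, and its sentence splitting is deliberately naive (it would also split on decimal points).

```python
from typing import List


class SentenceBuffer:
    """Collect streamed text deltas and emit complete sentences."""

    def __init__(self) -> None:
        self._pending = ""
        self.sentences: List[str] = []

    def __call__(self, delta: str) -> None:
        self._pending += delta
        # Flush for as long as the buffer contains a sentence terminator.
        while True:
            positions = [
                p for p in (self._pending.find(ch) for ch in ".!?") if p != -1
            ]
            if not positions:
                break
            cut = min(positions)
            self.sentences.append(self._pending[: cut + 1].strip())
            self._pending = self._pending[cut + 1:]


buffer = SentenceBuffer()
for delta in ["There are 12 Py", "thon files. Want", " a list?"]:
    buffer(delta)
print(buffer.sentences)  # ['There are 12 Python files.', 'Want a list?']
```

An instance can be passed directly as stream_callback=buffer, since it is callable.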

Return value:

{
    "final_response": str,          # The model's last text response
    "last_reasoning": str | None,   # Reasoning content from the last assistant turn (if any)
    "messages": list,               # Full OpenAI-format message history
    "completed": bool,              # True if loop ended cleanly (no tool calls remain)
    "partial": bool,                # True if stopped due to invalid tool calls
    "api_calls": int,               # Number of LLM API calls made
    "interrupted": bool,            # True if an interrupt was requested during the run
    "response_previewed": bool,     # True if the response was streamed/previewed before return
    # "interrupt_message": str      # Only present when interrupted=True and a message triggered it
}
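Callers usually branch on the status flags before trusting final_response. The helper below is a sketch that uses only the keys listed above. `describe_result` is a hypothetical name, and its last branch (treating a result that is neither completed, partial, nor interrupted as budget exhaustion) is an assumption.

```python
from typing import Any, Dict


def describe_result(result: Dict[str, Any]) -> str:
    """Classify a run_conversation() result using the documented keys."""
    if result.get("interrupted"):
        # interrupt_message is only present for message-triggered interrupts.
        reason = result.get("interrupt_message", "no message")
        return f"interrupted ({reason})"
    if result.get("partial"):
        return "partial: stopped on invalid tool calls"
    if result.get("completed"):
        return f"completed in {result['api_calls']} API call(s)"
    # Assumption: none of the flags set means the loop ran out of iterations.
    return "incomplete: iteration budget likely exhausted"


example = {
    "final_response": "Done.",
    "last_reasoning": None,
    "messages": [],
    "completed": True,
    "partial": False,
    "api_calls": 3,
    "interrupted": False,
    "response_previewed": False,
}
print(describe_result(example))  # completed in 3 API call(s)
```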

Multi-turn example:

from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    quiet_mode=True,
    enabled_toolsets=["file", "terminal"],
)

# First turn
result1 = agent.run_conversation("Create a file called hello.txt with 'Hello, World!'")
print(result1["final_response"])

# Continue the conversation with the previous history
result2 = agent.run_conversation(
    "Now read that file back and tell me its contents.",
    conversation_history=result1["messages"],
)
print(result2["final_response"])

With a custom system prompt:

result = agent.run_conversation(
    user_message="Analyze the security of this codebase.",
    system_message="You are a security auditor. Focus on authentication flows and input validation.",
    task_id="sec-audit-001",
)

Agent Loop Internals

The conversation loop in run_conversation() is entirely synchronous. At its core:

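# Simplified sketch of the loop; error handling, compression, and caching are omitted.
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tool_schemas,
    )
    api_call_count += 1  # every API call counts, including the final one
    message = response.choices[0].message
    if message.tool_calls:
        messages.append(message)  # keep the assistant turn that requested the tools
        for tool_call in message.tool_calls:
            result = handle_function_call(
                tool_call.function.name, tool_call.function.arguments, task_id
            )
            messages.append(tool_result_message(tool_call.id, result))
    else:
        return message.content  # No more tool calls: done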

Key behaviors:

Tool dispatch is handled by handle_function_call() from model_tools.py, which routes to tools/registry.py and then to the specific tool implementation.
Budget pressure — the agent injects warning text into tool results at 70% and 90% of max_iterations to prompt the model to wrap up.
Context compression — when the conversation approaches the model’s context window limit, ContextCompressor summarizes older messages automatically and rebuilds the system prompt.
Prompt caching — for Claude models via OpenRouter and native Anthropic, apply_anthropic_cache_control() marks the system prompt and recent messages for caching, reducing input costs by ~75% on long conversations.
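The budget-pressure thresholds described above can be sketched as a small function. The 70% and 90% cut-offs come from this page. The wording of the warnings and the name `budget_warning` are invented for illustration.

```python
from typing import Optional


def budget_warning(api_calls: int, max_iterations: int) -> Optional[str]:
    """Return a hypothetical warning string at the 70% and 90% thresholds."""
    # Integer arithmetic avoids floating-point surprises at exact boundaries.
    if api_calls * 10 >= max_iterations * 9:
        return f"[budget] {api_calls}/{max_iterations} iterations used; finish now."
    if api_calls * 10 >= max_iterations * 7:
        return f"[budget] {api_calls}/{max_iterations} iterations used; start wrapping up."
    return None


for used in (62, 63, 81):
    print(used, budget_warning(used, 90))
```

With the default max_iterations of 90, the first warning would appear at call 63 and the second at call 81.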

Message Format

All messages follow the OpenAI chat completions format:

# User message
{"role": "user", "content": "your message"}

# Assistant message with optional tool calls
{"role": "assistant", "content": "...", "tool_calls": [...], "reasoning": "..."}

# Tool result
{"role": "tool", "tool_call_id": "call_abc123", "content": "{...json...}"}

# System message
{"role": "system", "content": "..."}

Reasoning content from native thinking tokens is stored in assistant_msg["reasoning"], not embedded in content.
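Because the history is plain OpenAI-format data, it can be post-processed without touching the agent. The sketch below pairs each assistant tool call with its result by tool_call_id. It assumes each tool_calls entry follows the standard OpenAI shape (`id`, `type`, and a `function` object with `name` and JSON-encoded `arguments`). The tool name and payloads in the sample history are made up.

```python
import json
from typing import Any, Dict, List


def tool_activity(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Pair each assistant tool call with its tool result by tool_call_id."""
    results = {
        m["tool_call_id"]: m["content"] for m in messages if m.get("role") == "tool"
    }
    activity = []
    for message in messages:
        if message.get("role") != "assistant":
            continue
        for call in message.get("tool_calls") or []:
            activity.append({
                "name": call["function"]["name"],
                "arguments": json.loads(call["function"]["arguments"]),
                "result": results.get(call["id"]),  # None if no result was recorded
            })
    return activity


history = [
    {"role": "user", "content": "Read hello.txt"},
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "read_file", "arguments": '{"path": "hello.txt"}'},
        }],
    },
    {"role": "tool", "tool_call_id": "call_abc123", "content": '{"text": "Hello, World!"}'},
    {"role": "assistant", "content": "The file says: Hello, World!"},
]
print(tool_activity(history))
```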

Dependency Chain

tools/registry.py      ← no external deps; imported by all tool files
       ↑
tools/*.py             ← each calls registry.register() at import time
       ↑
model_tools.py         ← imports tools/registry, triggers tool discovery
       ↑
run_agent.py           ← imports model_tools; defines AIAgent

Running from run_agent import AIAgent initializes the full tool registry as a side effect of the import. get_tool_definitions() is then called in __init__ with your enabled_toolsets / disabled_toolsets filter applied.

ACP Server Mode

The gauss-acp entry point starts an Agent Client Protocol (ACP) server that exposes the AIAgent over a local socket. This enables deep IDE integration:

gauss acp          # Start the ACP server (used by VS Code, Zed, JetBrains extensions)

The ACP adapter lives in acp_adapter/entry.py and wraps AIAgent with the agent-client-protocol transport. Install the optional dependency with:

pip install "gauss-agent[acp]"
