
The AIAgent class in run_agent.py is the core of OpenGauss — a synchronous conversation loop that calls an LLM, dispatches tool calls, and keeps iterating until the model returns a plain response. You can import it directly and drive it programmatically without the interactive CLI.

Installation

pip install gauss-agent

Entry Points

Installing the package registers three CLI entry points defined in pyproject.toml:
| Command | Module | Purpose |
| --- | --- | --- |
| gauss | gauss_cli.main:main | Interactive CLI (TUI, slash commands, skin engine) |
| gauss-agent | run_agent:main | Headless agent runner (single prompt or Fire CLI) |
| gauss-acp | acp_adapter.entry:main | ACP server for VS Code, Zed, and JetBrains integration |

Quick Start

Here is the minimal pattern for programmatic use. Additional examples follow the full parameter reference below.
from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    max_iterations=10,
    enabled_toolsets=["file", "terminal"],
    quiet_mode=True,
)
response = agent.chat("List the files in the current directory")
print(response)
Warning: changing toolsets or rebuilding the system prompt mid-conversation invalidates prompt caching. OpenGauss caches the conversation prefix using Anthropic’s prompt caching (auto-enabled for Claude models via OpenRouter). Do not alter enabled_toolsets or disabled_toolsets after the first API call, and do not reload memories or reconstruct the system prompt mid-session. Breaking the cache means the prefix is billed again as uncached input, which sharply raises cost on long conversations. The only legitimate context alteration is an automatic context compression event.

Constructor: AIAgent.__init__

from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    max_iterations=90,
    quiet_mode=True,
)

Core Parameters

model
str
default:"anthropic/claude-opus-4.6"
Model identifier in OpenRouter format (e.g. "anthropic/claude-opus-4.6", "openai/gpt-4o"). Pass a bare model name when using a direct provider via base_url.
max_iterations
int
default:"90"
Maximum number of LLM API calls (tool-calling iterations) for the entire session. This budget is shared across the parent agent and any spawned subagents via an IterationBudget instance. Programmatic tool calls via execute_code are refunded and do not count against this limit.
enabled_toolsets
list[str]
default:"None"
Allowlist of toolset names to load. When set, only tools belonging to these toolsets are available. Examples: ["file", "terminal"], ["web", "browser"]. If None, all toolsets are enabled (subject to disabled_toolsets).
disabled_toolsets
list[str]
default:"None"
Denylist of toolset names to exclude. Applied after enabled_toolsets. Use this to strip a single toolset from an otherwise full configuration, e.g. disabled_toolsets=["browser"].
quiet_mode
bool
default:"false"
Suppress all initialization banners, spinner output, and tool-progress prints. Set this to True for programmatic or batch use where stdout cleanliness matters. Log levels for internal modules are also raised to ERROR when quiet mode is active.
save_trajectories
bool
default:"false"
When True, each completed conversation is serialized to a JSONL trajectory file in the standard from/value format used for LLM fine-tuning. batch_runner.py handles trajectory saving itself and always passes save_trajectories=False to avoid double-writing.
platform
str
default:"None"
The interface platform the user is on. Used to inject platform-specific formatting hints into the system prompt. Recognized values include "cli", "telegram", "discord", "whatsapp", and "slack".
session_id
str
default:"None"
Pre-generated session identifier used for log filenames and the SQLite session store. Auto-generated as YYYYMMDD_HHMMSS_<6-char-hex> when not provided. Pass an existing session ID to resume a conversation in the same log file.
skip_context_files
bool
default:"false"
When True, skips auto-injection of SOUL.md, AGENTS.md, and .cursorrules into the system prompt. Set this to True for batch processing and data generation to prevent user-specific persona files from polluting trajectories.
skip_memory
bool
default:"false"
When True, disables loading of persistent memory (MEMORY.md, USER.md) into the system prompt. Recommended for batch runs where cross-session memory is undesirable.
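The batch-oriented flags above are usually set together. The following is a minimal sketch, not part of OpenGauss: batch_agent_kwargs and new_session_id are hypothetical helpers, and the session ID only imitates the documented YYYYMMDD_HHMMSS_<6-char-hex> shape.

```python
import secrets
from datetime import datetime


def new_session_id(now=None):
    """Build an ID shaped like the documented YYYYMMDD_HHMMSS_<6-char-hex>."""
    now = now or datetime.now()
    # token_hex(3) renders 3 random bytes as 6 hexadecimal characters.
    return f"{now:%Y%m%d_%H%M%S}_{secrets.token_hex(3)}"


def batch_agent_kwargs(toolsets):
    """Hypothetical helper: constructor kwargs for a clean batch run."""
    return {
        "enabled_toolsets": list(toolsets),
        "quiet_mode": True,          # keep stdout clean
        "skip_context_files": True,  # no SOUL.md / AGENTS.md / .cursorrules
        "skip_memory": True,         # no MEMORY.md / USER.md
        "save_trajectories": False,  # as batch_runner.py does, to avoid double-writing
        "session_id": new_session_id(),
    }


kwargs = batch_agent_kwargs(["file", "terminal"])
# With OpenGauss installed:
#   from run_agent import AIAgent
#   agent = AIAgent(model="anthropic/claude-opus-4.6", **kwargs)
```

Building the kwargs once and reusing them per worker also keeps the toolset configuration fixed for the lifetime of each agent.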

Provider and API Mode Parameters

base_url
str
default:"None"
Base URL for the LLM API. Defaults to https://openrouter.ai/api/v1 when not provided. Override to target a local model server, direct Anthropic, or direct OpenAI.
api_key
str
default:"None"
API key for authentication. When omitted, the agent resolves credentials from environment variables (e.g. OPENROUTER_API_KEY, ANTHROPIC_API_KEY) via the provider router. Explicit keys take precedence.
provider
str
default:"None"
Provider identifier hint used for routing decisions. Common values: "openrouter", "anthropic", "openai", "openai-codex". When None, the provider is inferred from base_url.
api_mode
str
default:"None"
API protocol override. Accepted values: "chat_completions" (default for OpenRouter/OpenAI-compatible), "anthropic_messages" (Anthropic native API), "codex_responses" (OpenAI Codex Responses API). Inferred automatically from base_url when not set.
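As an illustration of the direct-provider case, here is a sketch of constructor kwargs for an OpenAI-compatible local server. The helper name, the URL, and the placeholder API key are assumptions made for the example; only the parameter names come from the reference above.

```python
def local_server_kwargs(model_name, base_url="http://localhost:8000/v1"):
    """Hypothetical helper: kwargs for an OpenAI-compatible local model server."""
    return {
        "model": model_name,             # bare name, no "vendor/" prefix
        "base_url": base_url,            # overrides the OpenRouter default
        "api_mode": "chat_completions",  # explicit here; otherwise inferred from base_url
        "api_key": "placeholder-key",    # assumption: the local server ignores the key
    }


kwargs = local_server_kwargs("my-local-model")
# With OpenGauss installed:
#   from run_agent import AIAgent
#   agent = AIAgent(quiet_mode=True, **kwargs)
```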

OpenRouter Routing Parameters

providers_allowed
list[str]
default:"None"
Allowlist of OpenRouter provider backends (e.g. ["anthropic", "google"]). Requests will only be routed to these providers.
providers_ignored
list[str]
default:"None"
Denylist of OpenRouter provider backends to exclude (e.g. ["together", "deepinfra"]).
providers_order
list[str]
default:"None"
Ordered list of OpenRouter providers to try in sequence (e.g. ["anthropic", "openai", "google"]).
provider_sort
str
default:"None"
Sort OpenRouter providers dynamically by "price", "throughput", or "latency".
provider_require_parameters
bool
default:"false"
When True, only allow OpenRouter providers that support every requested parameter (e.g. max_tokens, reasoning).
provider_data_collection
str
default:"None"
OpenRouter data collection policy. Pass "deny" to opt out of provider data collection on platforms that support it.
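The routing parameters can contradict each other if assembled by hand. The sketch below shows one possible pre-flight check; check_routing is a hypothetical helper, and OpenGauss may or may not validate these combinations itself.

```python
VALID_SORTS = {"price", "throughput", "latency"}  # values listed for provider_sort


def check_routing(kwargs):
    """Hypothetical pre-flight check; returns the kwargs unchanged when consistent."""
    preferred = set(kwargs.get("providers_allowed") or [])
    preferred |= set(kwargs.get("providers_order") or [])
    clash = preferred & set(kwargs.get("providers_ignored") or [])
    if clash:
        raise ValueError(f"providers both preferred and ignored: {sorted(clash)}")
    sort = kwargs.get("provider_sort")
    if sort is not None and sort not in VALID_SORTS:
        raise ValueError(f"unknown provider_sort: {sort!r}")
    return kwargs


routing_kwargs = check_routing({
    "providers_order": ["anthropic", "google"],      # try in this order
    "providers_ignored": ["together", "deepinfra"],  # never route here
    "provider_require_parameters": True,
    "provider_data_collection": "deny",
})
# With OpenGauss installed: agent = AIAgent(**routing_kwargs)
```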

Callback Parameters

tool_progress_callback
callable
default:"None"
Called with (tool_name: str, args_preview: str) just before each tool invocation. Use this to surface real-time tool progress in a custom UI.
thinking_callback
callable
default:"None"
Called with the raw thinking/reasoning content string as the model streams it.
reasoning_callback
callable
default:"None"
Alternative reasoning-content callback. Receives structured reasoning data from providers that expose it (e.g. DeepSeek, Qwen).
clarify_callback
callable
default:"None"
Called with (question: str, choices: list) -> str when the agent invokes the clarify tool to ask the user a question. Must return the user’s answer as a string. If None, the clarify tool returns an error message instead of blocking.
step_callback
callable
default:"None"
Called after each completed tool-calling iteration. Useful for progress tracking in long-running tasks.
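A minimal sketch of a callback set for a non-interactive host follows. The function names and the events list are illustrative. The arguments passed to step_callback are not specified above, so the sketch accepts anything.

```python
events = []  # collected (kind, payload) tuples for a custom UI or log


def on_tool_progress(tool_name, args_preview):
    # Documented signature: (tool_name: str, args_preview: str)
    events.append(("tool", f"{tool_name}: {args_preview}"))


def on_thinking(text):
    # Receives raw reasoning text as the model streams it.
    events.append(("thinking", text))


def on_clarify(question, choices):
    # Must return the answer as a string. This sketch picks the first offered
    # choice, or a fixed fallback when no choices are supplied.
    return str(choices[0]) if choices else "No preference; use your best judgment."


def on_step(*args, **kwargs):
    # Arguments are undocumented here, so accept whatever is passed.
    events.append(("step", ""))


callbacks = {
    "tool_progress_callback": on_tool_progress,
    "thinking_callback": on_thinking,
    "clarify_callback": on_clarify,
    "step_callback": on_step,
}
# With OpenGauss installed: agent = AIAgent(quiet_mode=True, **callbacks)
```

Supplying clarify_callback matters most in headless use: without it, the clarify tool returns an error message to the model instead of an answer.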

Generation and Context Parameters

max_tokens
int
default:"None"
Maximum tokens in each model response. Defaults to the model’s native limit when not set. Automatically mapped to max_completion_tokens for direct OpenAI API targets.
reasoning_config
dict
default:"None"
OpenRouter reasoning configuration. Defaults to {"enabled": True, "effort": "medium"} for Claude models. Set {"effort": "none"} to disable chain-of-thought. Set {"effort": "high"} for deeper reasoning on complex tasks.
prefill_messages
list[dict]
default:"None"
Messages prepended to every conversation as prefilled context. Each item is an OpenAI-format message dict: {"role": "user" | "assistant", "content": "..."}. Use for few-shot priming or persistent persona injection.
ephemeral_system_prompt
str
default:"None"
A system prompt that is used during agent execution but is not saved to trajectories or session logs. Useful for injecting ephemeral task context that should not appear in training data.
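To make the prefill format concrete, here is a small sketch that builds few-shot prefill_messages and combines them with the other generation settings. few_shot is a hypothetical helper, and the specific values (2048 tokens, the example prompt) are arbitrary.

```python
def few_shot(pairs):
    """Turn (user, assistant) example pairs into OpenAI-format prefill messages."""
    messages = []
    for user_text, assistant_text in pairs:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    return messages


generation_kwargs = {
    "max_tokens": 2048,                      # arbitrary cap for the example
    "reasoning_config": {"effort": "high"},  # deeper reasoning, as described above
    "prefill_messages": few_shot([("Reply in one word: 2 + 2?", "Four")]),
    "ephemeral_system_prompt": "Answer tersely.",  # not saved to trajectories
}
# With OpenGauss installed: agent = AIAgent(**generation_kwargs)
```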

Miscellaneous Parameters

tool_delay
float
default:"1.0"
Delay in seconds between consecutive tool calls. Increase to avoid rate limits on external APIs; set to 0.0 for maximum throughput in offline batch runs.
verbose_logging
bool
default:"false"
Enable DEBUG-level logging for the agent internals. Suppresses third-party library noise (OpenAI SDK, httpx) but surfaces model call details and tool dispatch traces.
log_prefix
str
default:""
Prefix string prepended to all console log messages. Useful in parallel batch runs to identify which worker produced a log line (e.g. "[B2:P17]").
log_prefix_chars
int
default:"100"
Maximum characters shown in log previews for tool arguments and responses.
session_db
object
default:"None"
An optional SessionDB instance (SQLite-backed, from gauss_state.py). When provided, all conversation messages are persisted for later retrieval via gauss session commands.
iteration_budget
IterationBudget
default:"None"
A pre-existing IterationBudget instance to share between a parent agent and its subagents. When None, a new budget is created from max_iterations. Pass the parent’s budget to child agents so all subagents consume from the same pool.
fallback_model
dict
default:"None"
A single backup model tried when the primary is unavailable (rate limit, overload, connection failure). Format: {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}.
checkpoints_enabled
bool
default:"false"
Enable filesystem checkpointing — automatic snapshots of the working directory after destructive operations (file writes, deletes). Stored transparently; not exposed as a tool.
checkpoint_max_snapshots
int
default:"50"
Maximum number of filesystem snapshots to retain when checkpoints_enabled=True. Older snapshots are pruned automatically once this limit is reached.
pass_session_id
bool
default:"false"
When True, injects the session_id into the system prompt so the model is aware of its own session identifier. Useful for multi-session workflows where the model needs to reference itself.
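The shared-pool idea behind iteration_budget can be pictured with a toy stand-in. This is not the OpenGauss IterationBudget class: only the remaining attribute appears in this reference, and the consume and refund method names are assumptions made for the sketch.

```python
import threading


class ToyIterationBudget:
    """Illustrative stand-in for a shared iteration pool (not the real class)."""

    def __init__(self, max_iterations):
        self._remaining = max_iterations
        self._lock = threading.Lock()  # parent and subagents may run concurrently

    @property
    def remaining(self):
        with self._lock:
            return self._remaining

    def consume(self):
        """Take one iteration; return False when the pool is empty."""
        with self._lock:
            if self._remaining <= 0:
                return False
            self._remaining -= 1
            return True

    def refund(self):
        """Give one iteration back, e.g. for a call that should not count."""
        with self._lock:
            self._remaining += 1


budget = ToyIterationBudget(3)
parent_ok = budget.consume()  # the parent agent uses one iteration
child_ok = budget.consume()   # a subagent draws from the same pool
budget.refund()               # a refunded call returns one iteration
```

Passing one budget object to every agent is what makes the limit global: each agent sees the same counter rather than its own copy of max_iterations.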

Methods

chat(message)

The simplest way to run the agent — pass a single message string and get a response string back.
def chat(self, message: str, stream_callback: Optional[callable] = None) -> str:
    """Simple interface — returns final response string."""
Parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| message | str | The user’s input message. |
| stream_callback | callable | Optional callback invoked with each text delta during streaming. Used by TTS pipelines to begin audio generation before the full response arrives. |
Example:
from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    quiet_mode=True,
    enabled_toolsets=["file"],
)

response = agent.chat("How many Python files are in the current directory?")
print(response)
# → "There are 12 Python files in the current directory."
chat() is a thin wrapper around run_conversation(): it passes no system message or history and returns result["final_response"].
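For the streaming case, a stream_callback typically needs to buffer deltas until a useful unit is complete. The sketch below assumes each delta is a plain string. SentenceBuffer is a hypothetical helper, and its sentence splitting is deliberately naive (it would also split on decimal points).

```python
class SentenceBuffer:
    """Accumulates streamed text deltas and releases complete sentences."""

    def __init__(self, on_sentence):
        self._pending = ""
        self._on_sentence = on_sentence

    def __call__(self, delta):
        self._pending += delta
        # Emit every complete sentence currently sitting in the buffer.
        while True:
            cut = next((i for i, ch in enumerate(self._pending) if ch in ".!?"), -1)
            if cut == -1:
                break
            sentence = self._pending[: cut + 1]
            self._pending = self._pending[cut + 1 :]
            self._on_sentence(sentence.strip())

    def flush(self):
        """Emit trailing text that never received a sentence terminator."""
        if self._pending.strip():
            self._on_sentence(self._pending.strip())
        self._pending = ""


spoken = []  # stands in for a TTS queue
buffer = SentenceBuffer(spoken.append)
# With OpenGauss installed:
#   agent.chat("Tell me a story", stream_callback=buffer)
#   buffer.flush()
```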

run_conversation(user_message, ...)

The full conversation interface. Returns a rich result dictionary that includes the final response, the complete message history, and metadata.
def run_conversation(
    self,
    user_message: str,
    system_message: str = None,
    conversation_history: list = None,
    task_id: str = None,
    stream_callback: Optional[callable] = None,
    persist_user_message: Optional[str] = None,
) -> dict:
    """Full interface — returns dict with final_response + messages."""
Parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| user_message | str | The user’s input to process. |
| system_message | str | Optional system prompt appended to the agent’s default identity prompt. Cached for the session lifetime; do not change between turns. |
| conversation_history | list | Prior messages in OpenAI format to prepend. Enables multi-turn conversations across separate run_conversation calls. |
| task_id | str | Unique identifier for this task. Isolates VM/browser resources per task. Required for parallel batch processing. |
| stream_callback | callable | Optional callback invoked with each text delta during streaming. Used by TTS pipelines to start audio generation before the full response arrives. When None, the standard non-streaming path is used. |
| persist_user_message | str | Optional clean user message to store in transcripts and history when user_message contains API-only synthetic prefixes that should not be persisted. |
Return value:
{
    "final_response": str,          # The model's last text response
    "last_reasoning": str | None,   # Reasoning content from the last assistant turn (if any)
    "messages": list,               # Full OpenAI-format message history
    "completed": bool,              # True if loop ended cleanly (no tool calls remain)
    "partial": bool,                # True if stopped due to invalid tool calls
    "api_calls": int,               # Number of LLM API calls made
    "interrupted": bool,            # True if an interrupt was requested during the run
    "response_previewed": bool,     # True if the response was streamed/previewed before return
    # "interrupt_message": str      # Only present when interrupted=True and a message triggered it
}
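One way to consume this dictionary is to reduce the status flags to a single outcome. The sketch below is illustrative: describe_result is a hypothetical helper, and the precedence it gives the flags (interrupted, then partial, then completed) is an assumption rather than documented behavior.

```python
def describe_result(result):
    """Map a run_conversation() result dict to a short status string."""
    if result.get("interrupted"):
        reason = result.get("interrupt_message")  # only present in some cases
        return f"interrupted: {reason}" if reason else "interrupted"
    if result.get("partial"):
        return "partial: stopped on invalid tool calls"
    if result.get("completed"):
        return f"completed in {result.get('api_calls', 0)} API call(s)"
    # Assumption: no flag set means the loop ended without a clean finish,
    # for example because the iteration budget ran out.
    return "incomplete: loop ended without a clean finish"


status = describe_result({"completed": True, "api_calls": 3, "final_response": "ok"})
```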
Multi-turn example:
from run_agent import AIAgent

agent = AIAgent(
    model="anthropic/claude-opus-4.6",
    quiet_mode=True,
    enabled_toolsets=["file", "terminal"],
)

# First turn
result1 = agent.run_conversation("Create a file called hello.txt with 'Hello, World!'")
print(result1["final_response"])

# Continue the conversation with the previous history
result2 = agent.run_conversation(
    "Now read that file back and tell me its contents.",
    conversation_history=result1["messages"],
)
print(result2["final_response"])
With a custom system prompt:
result = agent.run_conversation(
    user_message="Analyze the security of this codebase.",
    system_message="You are a security auditor. Focus on authentication flows and input validation.",
    task_id="sec-audit-001",
)
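The multi-turn pattern generalizes to any number of turns by threading each result's messages into the next call. In this sketch, run_turns is a hypothetical helper and EchoAgent is a stand-in that only mimics the run_conversation() call shape, so the example runs without OpenGauss installed.

```python
class EchoAgent:
    """Offline stand-in with the same run_conversation() call shape."""

    def run_conversation(self, user_message, system_message=None, conversation_history=None):
        messages = list(conversation_history or [])
        reply = f"echo: {user_message}"
        messages.append({"role": "user", "content": user_message})
        messages.append({"role": "assistant", "content": reply})
        return {"final_response": reply, "messages": messages}


def run_turns(agent, prompts, system_message=None):
    """Send prompts in order, passing each turn's messages to the next turn."""
    history = None
    responses = []
    for prompt in prompts:
        result = agent.run_conversation(
            prompt,
            system_message=system_message,  # same value on every turn
            conversation_history=history,
        )
        history = result["messages"]
        responses.append(result["final_response"])
    return responses, history


responses, history = run_turns(EchoAgent(), ["first", "second"])
```

Passing the same system_message on every turn follows the guidance above that the system prompt should not change within a session.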

Agent Loop Internals

The conversation loop in run_conversation() is entirely synchronous. In simplified pseudocode:
while api_call_count < self.max_iterations and self.iteration_budget.remaining > 0:
    # One LLM API call per iteration; every call counts toward the budget
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        tools=tool_schemas,
    )
    api_call_count += 1
    if response.tool_calls:
        # Run each requested tool and feed its result back to the model
        for tool_call in response.tool_calls:
            result = handle_function_call(tool_call.name, tool_call.args, task_id)
            messages.append(tool_result_message(result))
    else:
        return response.content  # No tool calls left: this is the final answer
Key behaviors:
  • Tool dispatch is handled by handle_function_call() from model_tools.py, which routes to tools/registry.py and then to the specific tool implementation.
  • Budget pressure — the agent injects warning text into tool results at 70% and 90% of max_iterations to prompt the model to wrap up.
  • Context compression — when the conversation approaches the model’s context window limit, ContextCompressor summarizes older messages automatically and rebuilds the system prompt.
  • Prompt caching — for Claude models via OpenRouter and native Anthropic, apply_anthropic_cache_control() marks the system prompt and recent messages for caching, reducing input costs by ~75% on long conversations.
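The budget-pressure behavior in the list above can be sketched as a threshold check. Only the 70% and 90% thresholds come from this page. The helper name, the warning wording, and the use of "at or above" comparisons are assumptions.

```python
def budget_warning(api_calls, max_iterations):
    """Return a warning string at 70% and 90% of the budget, else None."""
    if max_iterations <= 0:
        return None
    # Integer arithmetic avoids floating-point edge cases at the thresholds.
    if api_calls * 10 >= max_iterations * 9:
        return f"[budget] {api_calls}/{max_iterations} iterations used; give your final answer now."
    if api_calls * 10 >= max_iterations * 7:
        return f"[budget] {api_calls}/{max_iterations} iterations used; start wrapping up."
    return None


def with_budget_note(tool_result, api_calls, max_iterations):
    """Append the warning, if any, to a tool result string."""
    note = budget_warning(api_calls, max_iterations)
    return f"{tool_result}\n\n{note}" if note else tool_result


example = with_budget_note("file written", 63, 90)
```

Appending the note to tool results, rather than editing the system prompt, leaves the cached conversation prefix untouched.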

Message Format

All messages follow the OpenAI chat completions format:
# User message
{"role": "user", "content": "your message"}

# Assistant message with optional tool calls
{"role": "assistant", "content": "...", "tool_calls": [...], "reasoning": "..."}

# Tool result
{"role": "tool", "tool_call_id": "call_abc123", "content": "{...json...}"}

# System message
{"role": "system", "content": "..."}
Reasoning content from native thinking tokens is stored in assistant_msg["reasoning"], not embedded in content.
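Two small helpers illustrate working with this format: one builds a tool-result message, the other pulls reasoning out of a message history. Both names are hypothetical, and the sketch assumes tool results are carried as JSON strings, as the example above suggests.

```python
import json


def make_tool_result(tool_call_id, payload):
    """Build a tool-result message with the payload JSON-encoded into content."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": json.dumps(payload)}


def collect_reasoning(messages):
    """Return the reasoning strings stored on assistant messages, in order."""
    return [
        m["reasoning"]
        for m in messages
        if m.get("role") == "assistant" and m.get("reasoning")
    ]


history = [
    {"role": "user", "content": "your message"},
    {"role": "assistant", "content": "...", "tool_calls": [], "reasoning": "check the file first"},
    make_tool_result("call_abc123", {"ok": True}),
    {"role": "assistant", "content": "Done."},
]
reasoning = collect_reasoning(history)
```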

Dependency Chain

tools/registry.py      ← no external deps; imported by all tool files

tools/*.py             ← each calls registry.register() at import time

model_tools.py         ← imports tools/registry, triggers tool discovery

run_agent.py           ← imports model_tools; defines AIAgent
Importing AIAgent (from run_agent import AIAgent) initializes the full tool registry. get_tool_definitions() is then called in __init__ with your enabled_toolsets / disabled_toolsets filter applied.
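The register-at-import pattern and the toolset filter can be pictured with a toy registry. This is not the code in tools/registry.py or model_tools.py. Every name below is invented for the sketch, and only the allowlist-then-denylist order follows the parameter descriptions above.

```python
_REGISTRY = {}  # tool name -> {"toolset": ..., "schema": ..., "handler": ...}


def register(name, toolset, description):
    """Decorator that records a tool when its module is imported (toy version)."""
    def decorator(func):
        _REGISTRY[name] = {
            "toolset": toolset,
            "schema": {"name": name, "description": description},
            "handler": func,
        }
        return func
    return decorator


@register("read_file", toolset="file", description="Read a text file")
def read_file(path):
    with open(path, encoding="utf-8") as fh:
        return fh.read()


@register("run_command", toolset="terminal", description="Run a shell command")
def run_command(command):
    raise NotImplementedError("placeholder handler for the sketch")


def toy_tool_definitions(enabled_toolsets=None, disabled_toolsets=None):
    """Apply the allowlist first, then the denylist, and return matching schemas."""
    disabled = set(disabled_toolsets or [])
    schemas = []
    for entry in _REGISTRY.values():
        if enabled_toolsets is not None and entry["toolset"] not in enabled_toolsets:
            continue  # not on the allowlist
        if entry["toolset"] in disabled:
            continue  # removed by the denylist
        schemas.append(entry["schema"])
    return schemas


file_only = toy_tool_definitions(enabled_toolsets=["file"])
```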

ACP Server Mode

The gauss-acp entry point starts an Agent Client Protocol (ACP) server that exposes the AIAgent over a local socket. This enables deep IDE integration:
gauss acp          # Start the ACP server (used by VS Code, Zed, JetBrains extensions)
The ACP adapter lives in acp_adapter/entry.py and wraps AIAgent with the agent-client-protocol transport. Install the optional dependency with:
pip install "gauss-agent[acp]"
