The AIAgent class in run_agent.py is the core of OpenGauss: a synchronous conversation loop that calls an LLM, dispatches tool calls, and keeps iterating until the model returns a plain response. You can import it directly and drive it programmatically without the interactive CLI.
Installation
- PyPI
- From source
- Full extras
Entry Points
Installing the package registers three CLI entry points defined in pyproject.toml:
| Command | Module | Purpose |
|---|---|---|
| gauss | gauss_cli.main:main | Interactive CLI (TUI, slash commands, skin engine) |
| gauss-agent | run_agent:main | Headless agent runner (single prompt or Fire CLI) |
| gauss-acp | acp_adapter.entry:main | ACP server for VS Code, Zed, and JetBrains integration |
Quick Start
Here is the minimal pattern for programmatic use. Additional examples follow the full parameter reference below.
Constructor: AIAgent.__init__
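A construction call might look like the following sketch. It is not taken from the OpenGauss source: `model` is an assumed keyword name, the numeric budget is an arbitrary illustration, and `ask` is a hypothetical helper. The remaining keyword names (`max_iterations`, `enabled_toolsets`, `save_trajectories`) and the `chat()` method are the ones this page mentions.

```python
# Keyword arguments for AIAgent.
# ASSUMPTION: "model" is the keyword for the model identifier; the other
# names appear in this page's parameter descriptions.
AGENT_KWARGS = {
    "model": "anthropic/claude-opus-4.6",
    "max_iterations": 30,  # arbitrary illustrative budget
    "enabled_toolsets": ["file", "terminal"],
    "save_trajectories": False,
}


def ask(prompt: str, **overrides) -> str:
    """Build an agent and return its reply to a single prompt (hypothetical helper)."""
    # Imported lazily so this module still loads where OpenGauss is not installed.
    from run_agent import AIAgent

    agent = AIAgent(**{**AGENT_KWARGS, **overrides})
    return agent.chat(prompt)
```

Calling `ask("Summarize README.md")` would then run one full conversation, provided OpenGauss is installed and credentials are available in the environment.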
Core Parameters
Model identifier in OpenRouter format (e.g. "anthropic/claude-opus-4.6", "openai/gpt-4o"). Pass a bare model name when using a direct provider via base_url.
Maximum number of LLM API calls (tool-calling iterations) for the entire session. This budget is shared across the parent agent and any spawned subagents via an IterationBudget instance. Programmatic tool calls via execute_code are refunded and do not count against this limit.
Allowlist of toolset names to load. When set, only tools belonging to these toolsets are available. Examples: ["file", "terminal"], ["web", "browser"]. If None, all toolsets are enabled (subject to disabled_toolsets).
Denylist of toolset names to exclude. Applied after enabled_toolsets. Use this to strip a single toolset from an otherwise full configuration, e.g. disabled_toolsets=["browser"].
Suppress all initialization banners, spinner output, and tool-progress prints. Set this to True for programmatic or batch use where stdout cleanliness matters. Log levels for internal modules are also raised to ERROR when quiet mode is active.
When True, each completed conversation is serialized to a JSONL trajectory file in the standard from/value format used for LLM fine-tuning. batch_runner.py handles trajectory saving itself and always passes save_trajectories=False to avoid double-writing.
The interface platform the user is on. Used to inject platform-specific formatting hints into the system prompt. Recognized values include "cli", "telegram", "discord", "whatsapp", and "slack".
Pre-generated session identifier used for log filenames and the SQLite session store. Auto-generated as YYYYMMDD_HHMMSS_<6-char-hex> when not provided. Pass an existing session ID to resume a conversation in the same log file.
When True, skips auto-injection of SOUL.md, AGENTS.md, and .cursorrules into the system prompt. Set this to True for batch processing and data generation to prevent user-specific persona files from polluting trajectories.
When True, disables loading of persistent memory (MEMORY.md, USER.md) into the system prompt. Recommended for batch runs where cross-session memory is undesirable.
Provider and API Mode Parameters
Base URL for the LLM API. Defaults to https://openrouter.ai/api/v1 when not provided. Override to target a local model server, direct Anthropic, or direct OpenAI.
API key for authentication. When omitted, the agent resolves credentials from environment variables (e.g. OPENROUTER_API_KEY, ANTHROPIC_API_KEY) via the provider router. Explicit keys take precedence.
Provider identifier hint used for routing decisions. Common values: "openrouter", "anthropic", "openai", "openai-codex". When None, the provider is inferred from base_url.
API protocol override. Accepted values: "chat_completions" (default for OpenRouter/OpenAI-compatible), "anthropic_messages" (Anthropic native API), "codex_responses" (OpenAI Codex Responses API). Inferred automatically from base_url when not set.
OpenRouter Routing Parameters
Allowlist of OpenRouter provider backends (e.g. ["anthropic", "google"]). Requests will only be routed to these providers.
Denylist of OpenRouter provider backends to exclude (e.g. ["together", "deepinfra"]).
Ordered list of OpenRouter providers to try in sequence (e.g. ["anthropic", "openai", "google"]).
Sort OpenRouter providers dynamically by "price", "throughput", or "latency".
When True, only allow OpenRouter providers that support every requested parameter (e.g. max_tokens, reasoning).
OpenRouter data collection policy. Pass "deny" to opt out of provider data collection on platforms that support it.
Callback Parameters
Called with (tool_name: str, args_preview: str) just before each tool invocation. Use this to surface real-time tool progress in a custom UI.
Called with the raw thinking/reasoning content string as the model streams it.
Alternative reasoning-content callback. Receives structured reasoning data from providers that expose it (e.g. DeepSeek, Qwen).
Called with (question: str, choices: list) -> str when the agent invokes the clarify tool to ask the user a question. Must return the user’s answer as a string. If None, the clarify tool returns an error message instead of blocking.
Called after each completed tool-calling iteration. Useful for progress tracking in long-running tasks.
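The two callback signatures spelled out above can be sketched as plain functions. The function names and the "first choice" policy are hypothetical. The constructor keywords that accept these callables are not shown on this page, so the sketch stops short of passing them to AIAgent.

```python
from typing import List

# Collected progress lines; a real UI would render these instead of storing them.
progress_log: List[str] = []


def on_tool_progress(tool_name: str, args_preview: str) -> None:
    """Record one line per tool invocation, using the (tool_name, args_preview) signature."""
    progress_log.append(f"[tool] {tool_name}: {args_preview}")


def on_clarify(question: str, choices: list) -> str:
    """Answer a clarify request without blocking on user input."""
    # Hypothetical policy for unattended runs: take the first offered choice,
    # or return an empty string when no choices were offered.
    return str(choices[0]) if choices else ""
```

In an interactive front end, `on_clarify` would typically prompt the user and return what they typed.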
Generation and Context Parameters
Maximum tokens in each model response. Defaults to the model’s native limit when not set. Automatically mapped to max_completion_tokens for direct OpenAI API targets.
OpenRouter reasoning configuration. Defaults to {"enabled": True, "effort": "medium"} for Claude models. Set {"effort": "none"} to disable chain-of-thought. Set {"effort": "high"} for deeper reasoning on complex tasks.
Messages prepended to every conversation as prefilled context. Each item is an OpenAI-format message dict: {"role": "user" | "assistant", "content": "..."}. Use for few-shot priming or persistent persona injection.
A system prompt that is used during agent execution but is not saved to trajectories or session logs. Useful for injecting ephemeral task context that should not appear in training data.
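As an illustration of the dict shapes described above, the following sketch builds a reasoning configuration and a one-exchange prefill list. `check_prefill` is a hypothetical helper that only checks the documented message shape; it is not part of OpenGauss.

```python
# Reasoning configuration in the dict shape described above.
reasoning_config = {"effort": "high"}

# Few-shot priming: one example exchange prepended to every conversation.
prefill_messages = [
    {"role": "user", "content": "State the triangle inequality."},
    {"role": "assistant", "content": "For all real a and b, |a + b| <= |a| + |b|."},
]


def check_prefill(messages: list) -> bool:
    """Return True when every item is a user/assistant dict with string content."""
    return all(
        isinstance(m, dict)
        and m.get("role") in ("user", "assistant")
        and isinstance(m.get("content"), str)
        for m in messages
    )
```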
Miscellaneous Parameters
Delay in seconds between consecutive tool calls. Increase to avoid rate limits on external APIs; set to 0.0 for maximum throughput in offline batch runs.
Enable DEBUG-level logging for the agent internals. Suppresses third-party library noise (OpenAI SDK, httpx) but surfaces model call details and tool dispatch traces.
Prefix string prepended to all console log messages. Useful in parallel batch runs to identify which worker produced a log line (e.g. "[B2:P17]").
Maximum characters shown in log previews for tool arguments and responses.
An optional SessionDB instance (SQLite-backed, from gauss_state.py). When provided, all conversation messages are persisted for later retrieval via gauss session commands.
A pre-existing IterationBudget instance to share between a parent agent and its subagents. When None, a new budget is created from max_iterations. Pass the parent’s budget to child agents so all subagents consume from the same pool.
A single backup model tried when the primary is unavailable (rate limit, overload, connection failure). Format: {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}.
Enable filesystem checkpointing: automatic snapshots of the working directory after destructive operations (file writes, deletes). Stored transparently; not exposed as a tool.
Maximum number of filesystem snapshots to retain when checkpoints_enabled=True. Older snapshots are pruned automatically once this limit is reached.
When True, injects the session_id into the system prompt so the model is aware of its own session identifier. Useful for multi-session workflows where the model needs to reference itself.
Methods
chat(message)
The simplest way to run the agent — pass a single message string and get a response string back.
| Parameter | Type | Description |
|---|---|---|
| message | str | The user’s input message. |
| stream_callback | callable | Optional callback invoked with each text delta during streaming. Used by TTS pipelines to begin audio generation before the full response arrives. |
chat() is a thin wrapper around run_conversation() — it calls it with no system message or history and returns result["final_response"].
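The relationship between the two methods can be pictured with a stand-in class. `SketchAgent` is hypothetical and echoes its input instead of calling a model; only the delegation pattern and the "final_response" key come from the description above.

```python
class SketchAgent:
    """Stand-in that mirrors the chat()/run_conversation() relationship only."""

    def run_conversation(self, user_message, system_message=None,
                         conversation_history=None, stream_callback=None):
        # The real method drives an LLM and tools; this stub just echoes the input.
        return {"final_response": f"echo: {user_message}"}

    def chat(self, message, stream_callback=None):
        # No system message and no history are passed; only the final text is returned.
        result = self.run_conversation(message, stream_callback=stream_callback)
        return result["final_response"]
```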
run_conversation(user_message, ...)
The full conversation interface. Returns a rich result dictionary that includes the final response, the complete message history, and metadata.
| Parameter | Type | Description |
|---|---|---|
| user_message | str | The user’s input to process. |
| system_message | str | Optional system prompt appended to the agent’s default identity prompt. Cached for the session lifetime; do not change between turns. |
| conversation_history | list | Prior messages in OpenAI format to prepend. Enables multi-turn conversations across separate run_conversation calls. |
| task_id | str | Unique identifier for this task. Isolates VM/browser resources per task. Required for parallel batch processing. |
| stream_callback | callable | Optional callback invoked with each text delta during streaming. Used by TTS pipelines to start audio generation before the full response arrives. When None, the standard non-streaming path is used. |
| persist_user_message | str | Optional clean user message to store in transcripts and history when user_message contains API-only synthetic prefixes that should not be persisted. |
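Multi-turn use through conversation_history could be sketched as below. Two things here are assumptions: that the result dictionary exposes the message history under a "messages" key, and the `run_turns` helper itself. `StubAgent` is a test double so the sketch runs without a model; any object with a compatible run_conversation method can be passed instead.

```python
def run_turns(agent, prompts, task_id=None):
    """Send prompts in order, feeding each turn's history into the next."""
    history = []
    replies = []
    for prompt in prompts:
        result = agent.run_conversation(
            prompt, conversation_history=history, task_id=task_id
        )
        # ASSUMPTION: the complete message history is stored under "messages".
        history = result["messages"]
        replies.append(result["final_response"])
    return replies, history


class StubAgent:
    """Test double: appends one user and one assistant message per turn."""

    def run_conversation(self, user_message, conversation_history=None, task_id=None):
        messages = list(conversation_history or [])
        messages.append({"role": "user", "content": user_message})
        reply = f"turn {len(messages) // 2 + 1}"
        messages.append({"role": "assistant", "content": reply})
        return {"final_response": reply, "messages": messages}
```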
Agent Loop Internals
The conversation loop in run_conversation() is entirely synchronous. At its core:
- Tool dispatch is handled by handle_function_call() from model_tools.py, which routes to tools/registry.py and then to the specific tool implementation.
- Budget pressure: the agent injects warning text into tool results at 70% and 90% of max_iterations to prompt the model to wrap up.
- Context compression: when the conversation approaches the model’s context window limit, ContextCompressor summarizes older messages automatically and rebuilds the system prompt.
- Prompt caching: for Claude models via OpenRouter and native Anthropic, apply_anthropic_cache_control() marks the system prompt and recent messages for caching, reducing input costs by ~75% on long conversations.
Message Format
All messages follow the OpenAI chat completions format. Reasoning content is stored in assistant_msg["reasoning"], not embedded in content.
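For orientation, a short history in that format might look like the sketch below. The tool name `read_file`, the call id, and all message text are hypothetical; the field layout follows the public OpenAI chat completions message schema, plus the separate "reasoning" key mentioned above.

```python
import json

conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is in notes.txt?"},
    {
        "role": "assistant",
        "content": None,
        # Reasoning sits in its own key rather than inside content.
        "reasoning": "I should read the file before answering.",
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            # "read_file" is a hypothetical tool name used for illustration.
            "function": {
                "name": "read_file",
                "arguments": json.dumps({"path": "notes.txt"}),
            },
        }],
    },
    # The tool result refers back to the call through tool_call_id.
    {"role": "tool", "tool_call_id": "call_1", "content": "buy milk"},
    {"role": "assistant", "content": "The file says: buy milk."},
]
```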
Dependency Chain
When you run from run_agent import AIAgent, the full tool registry is initialized. get_tool_definitions() is then called in __init__ with your enabled_toolsets / disabled_toolsets filter applied.
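The filter order described on this page (allowlist first, then denylist) can be sketched as a standalone function. `filter_toolsets` is a hypothetical helper written for illustration; it does not reproduce get_tool_definitions().

```python
from typing import Iterable, List, Optional


def filter_toolsets(available: Iterable[str],
                    enabled_toolsets: Optional[List[str]] = None,
                    disabled_toolsets: Optional[List[str]] = None) -> List[str]:
    """Apply the allowlist, then the denylist, preserving the original order."""
    names = list(available)
    # None means "no allowlist": every toolset stays eligible.
    if enabled_toolsets is not None:
        allowed = set(enabled_toolsets)
        names = [n for n in names if n in allowed]
    # The denylist is applied last, so it also removes allowlisted names.
    denied = set(disabled_toolsets or [])
    return [n for n in names if n not in denied]
```

With this ordering, a toolset named in both lists ends up excluded.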
ACP Server Mode
The gauss-acp entry point starts an Agent Communication Protocol (ACP) server that exposes the AIAgent over a local socket. This enables deep IDE integration with VS Code, Zed, and JetBrains.
The server is implemented in acp_adapter/entry.py and wraps AIAgent with the agent-client-protocol transport. Install the optional dependency with: