Documentation Index
Fetch the complete documentation index at: https://mintlify.com/5unnykum4r/grip-ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Grip AI uses a dual-engine architecture that lets you choose between two execution backends:
- Claude Agent SDK (primary) — Full agentic loop delegated to the Claude CLI
- LiteLLM (fallback) — Internal agent loop supporting 100+ models via LiteLLM
Both engines implement the same EngineProtocol interface, so switching between them requires zero code changes. The factory pattern automatically selects the appropriate engine based on your configuration.
Engine Protocol
Both engines implement three core methods:
class EngineProtocol(ABC):
async def run(
self,
user_message: str,
*,
session_key: str = "cli:default",
model: str | None = None,
) -> AgentRunResult:
"""Send a user message through the engine and return the result."""
async def consolidate_session(self, session_key: str) -> None:
"""Summarise and compact conversation history for a session."""
async def reset_session(self, session_key: str) -> None:
"""Clear all conversation history for a session."""
The AgentRunResult dataclass is unified across both engines:
@dataclass
class AgentRunResult:
response: str
iterations: int = 0
prompt_tokens: int = 0
completion_tokens: int = 0
tool_calls_made: list[str] = field(default_factory=list)
tool_details: list[ToolCallDetail] = field(default_factory=list)
Claude Agent SDK Engine
Architecture
The SDK engine (SDKRunner) delegates the full agentic loop to the Claude Agent SDK. Grip only provides:
- System prompt assembly (identity files, memory, skills)
- Custom tools (send_message, send_file, remember, recall)
- MCP server configuration translation
- History persistence via MemoryManager
When to Use
Use the Claude SDK engine when:
- You’re using Claude models (claude-3-5-sonnet, claude-3-opus, etc.)
- You want the latest Claude agentic capabilities
- You need native computer use support
- You prefer Claude’s native tool execution loop
System Prompt Assembly
The SDK engine builds prompts from multiple sources:
def _build_system_prompt(
self, user_message: str, session_key: str, custom_tools: list | None = None,
) -> str:
parts: list[str] = []
# Identity files (AGENT.md, IDENTITY.md, SOUL.md, USER.md)
identity_files = self._workspace.read_identity_files()
for filename, content in identity_files.items():
parts.append(f"## {filename}\n\n{content}")
# Search long-term memory for relevant facts
memory_results = self._memory_mgr.search_memory(user_message, max_results=5)
if memory_results:
memory_text = "\n".join(f"- {fact}" for fact in memory_results)
parts.append(f"## Relevant Memory\n\n{memory_text}")
# Search conversation history
history_results = self._memory_mgr.search_history(user_message, max_results=5)
if history_results:
history_text = "\n".join(f"- {entry}" for entry in history_results)
parts.append(f"## Relevant History\n\n{history_text}")
# Inject learned behavioral patterns from KnowledgeBase
if self._kb and self._kb.count > 0:
kb_context = self._kb.export_for_context(max_chars=800)
if kb_context:
parts.append(f"## Learned Patterns\n\n{kb_context}")
return "\n\n---\n\n".join(parts)
The SDK engine exposes custom tools via in-process MCP server:
- send_message — Route messages through gateway callbacks
- send_file — Send files via configured channels
- remember — Store facts in MEMORY.md
- recall — Search long-term memory
- stock_quote — (optional) Fetch stock prices if yfinance installed
MCP Server Translation
Grip’s MCP config format is translated to SDK-compatible format:
def _build_mcp_config(self) -> list[dict[str, Any]]:
result: list[dict[str, Any]] = []
for name, srv in self._mcp_servers.items():
if not srv.enabled:
continue
if srv.url:
# URL-based server (SSE transport)
entry = {
"name": name,
"url": srv.url,
"headers": dict(srv.headers),
}
if srv.type:
entry["type"] = srv.type
result.append(entry)
elif srv.command:
# Stdio-based server
result.append({
"name": name,
"command": srv.command,
"args": list(srv.args),
"env": dict(srv.env),
})
return result
LiteLLM Engine
Architecture
The LiteLLM engine (LiteLLMRunner) wraps Grip’s internal AgentLoop stack:
create_provider(config) → LLM provider
create_default_registry(...) → tool registry
- Optionally
SemanticCache(...) if enabled in config
AgentLoop(...) with all dependencies wired together
When to Use
Use the LiteLLM engine when:
- You need non-Claude models (GPT-4, Gemini, Mistral, local models, etc.)
- You want full control over the agent loop
- You need custom provider configurations
- The Claude SDK is not installed or unavailable
Agent Loop
The LiteLLM engine uses Grip’s internal agent loop with:
- Iterative tool execution — Loop until LLM returns text (no tool calls)
- Mid-run compaction — Summarize old messages when context exceeds 50 messages
- Self-correction — Inject reflection prompts when tools fail
- Cost-aware routing — Use cheaper models for simple queries
- Semantic caching — Cache identical queries to save tokens
The LiteLLM engine creates a full tool registry:
# Build the tool registry with any configured MCP servers
self._registry = create_default_registry(mcp_servers=config.tools.mcp_servers)
# Optionally create a semantic cache for duplicate-query savings
cache: SemanticCache | None = None
defaults = config.agents.defaults
if defaults.semantic_cache_enabled:
state_dir = defaults.workspace.expanduser().resolve() / "state"
cache = SemanticCache(
state_dir,
ttl_seconds=defaults.semantic_cache_ttl,
enabled=True,
)
# Wire everything into the AgentLoop
self._loop = AgentLoop(
config,
provider,
workspace,
tool_registry=self._registry,
session_manager=session_mgr,
memory_manager=memory_mgr,
semantic_cache=cache,
trust_manager=trust_mgr,
knowledge_base=knowledge_base,
)
Engine Factory
The create_engine factory reads your config and returns the appropriate engine:
def create_engine(
config: GripConfig,
workspace: WorkspaceManager,
session_mgr: SessionManager,
memory_mgr: MemoryManager,
*,
trust_mgr: TrustManager | None = None,
) -> EngineProtocol:
kb = _create_knowledge_base(config)
engine_choice = config.agents.defaults.engine
engine: EngineProtocol
if engine_choice == "claude_sdk":
try:
sdk_runner_cls = _import_sdk_runner()
logger.info("Using Claude Agent SDK engine (SDKRunner).")
engine = sdk_runner_cls(
config=config,
workspace=workspace,
session_mgr=session_mgr,
memory_mgr=memory_mgr,
trust_mgr=trust_mgr,
knowledge_base=kb,
)
except ImportError:
logger.warning(
"claude_agent_sdk is not installed; falling back to LiteLLM engine. "
"Install it with: pip install claude-agent-sdk"
)
engine = _build_litellm_runner(
config, workspace, session_mgr, memory_mgr, trust_mgr, kb
)
else:
logger.info("Using LiteLLM engine (LiteLLMRunner).")
engine = _build_litellm_runner(config, workspace, session_mgr, memory_mgr, trust_mgr, kb)
# Wrap with behavioral learning (rule-based, zero LLM calls)
from grip.engines.learning import LearningEngine
from grip.memory.pattern_extractor import PatternExtractor
engine = LearningEngine(engine, kb, PatternExtractor())
logger.info("Behavioral pattern learning enabled.")
# Wrap with token tracking if daily limit is configured
max_daily = config.agents.defaults.max_daily_tokens
if max_daily > 0:
from grip.engines.tracked import TrackedEngine
from grip.security.token_tracker import TokenTracker
state_dir = config.agents.defaults.workspace.expanduser().resolve() / "state"
tracker = TokenTracker(state_dir, max_daily)
engine = TrackedEngine(engine, tracker)
logger.info("Token tracking enabled (daily limit: {})", max_daily)
return engine
Switching Engines
Configuration File
Environment Variable
CLI Flag
Edit your grip.yml:agents:
defaults:
# Use Claude SDK engine
engine: claude_sdk
sdk_model: claude-3-5-sonnet-20241022
sdk_permission_mode: interactive
# OR use LiteLLM engine
engine: litellm
model: gpt-4o
Override at runtime:# Use Claude SDK
GRIP_AGENTS__DEFAULTS__ENGINE=claude_sdk grip
# Use LiteLLM
GRIP_AGENTS__DEFAULTS__ENGINE=litellm grip
Specify when launching:# Use Claude SDK with specific model
grip agent --engine claude_sdk --model claude-3-5-sonnet-20241022
# Use LiteLLM with GPT-4
grip agent --engine litellm --model gpt-4o
Automatic Fallback
If you configure engine: claude_sdk but the SDK package is not installed, Grip automatically falls back to LiteLLM:
WARNING: claude_agent_sdk is not installed; falling back to LiteLLM engine.
Install it with: pip install claude-agent-sdk
This ensures Grip always works even if optional dependencies are missing.
Engine Wrappers
Both engines are wrapped with additional capabilities:
Learning Engine
Extracts behavioral patterns from tool executions and stores them in the knowledge base (zero LLM calls, rule-based):
engine = LearningEngine(engine, kb, PatternExtractor())
Tracked Engine
Enforces daily token limits:
if config.agents.defaults.max_daily_tokens > 0:
tracker = TokenTracker(state_dir, max_daily_tokens)
engine = TrackedEngine(engine, tracker)
Configuration Reference
Key configuration options for engines:
agents:
defaults:
# Engine selection
engine: claude_sdk # or litellm
# Claude SDK settings
sdk_model: claude-3-5-sonnet-20241022
sdk_permission_mode: interactive # or approve_all, deny_all
# LiteLLM settings
model: gpt-4o
temperature: 0.7
max_tokens: 4096
max_tool_iterations: 25 # 0 = unlimited
# Token tracking
max_daily_tokens: 1000000 # 0 = no limit
# Semantic cache (LiteLLM only)
semantic_cache_enabled: true
semantic_cache_ttl: 3600 # seconds
Next Steps