Claude Models
Claude models are accessed through the proxy using Anthropic’s API format. All Claude models support extended thinking capabilities.claude-opus-4-6-thinking
Most Capable ModelClaude Opus 4.6 with extended thinking capabilities. Best for complex reasoning tasks and challenging problems.
- Extended thinking output
- Highest capability tier
- Best for multi-step reasoning
claude-sonnet-4-5-thinking
Balanced PerformanceClaude Sonnet 4.5 with extended thinking. Excellent balance of speed and capability.
- Extended thinking output
- Fast response times
- Ideal for general coding tasks
claude-sonnet-4-5
Standard ModelClaude Sonnet 4.5 without thinking output. Fastest response times for straightforward tasks.
- No thinking blocks
- Maximum speed
- Best for simple operations
Claude Thinking Models
Claude thinking models include an internal reasoning process before generating their final response:- signature field: Claude uses the
signaturefield on thinking blocks for multi-turn conversations - Thinking blocks: Extended reasoning is included in the response as separate content blocks
- Cache support: Thinking signatures are cached for conversation continuity
Gemini Models
Gemini models provide high-performance alternatives with Google’s latest AI technology. All Gemini models version 3+ include thinking capabilities.gemini-3.1-pro-high
High Performance TierGemini 3.1 Pro High with advanced thinking. Best for demanding workloads.
- Extended thinking via
thoughtSignature - High quota allocation
- Best for production use
gemini-3.1-pro-low
Balanced TierGemini 3.1 Pro Low with thinking support. Good balance of performance and quota.
- Extended thinking support
- Moderate quota allocation
- General purpose use
gemini-3-flash
Fast ResponsesGemini 3 Flash with thinking. Optimized for speed and efficiency.
- Quick response times
- Extended thinking included
- Ideal for rapid iteration
Gemini Thinking Models
Gemini models (version 3 and higher) support thinking capabilities:- thoughtSignature field: Gemini uses the
thoughtSignaturefield onfunctionCallparts - Automatic detection: Models with “thinking” in the name or version 3+ are treated as thinking models
- Signature caching:
thoughtSignaturevalues are cached for 2 hours - Fallback handling: If Claude Code strips the signature, the proxy restores it from cache or uses a sentinel value
Model Selection in Claude Code
Configure your Claude Code CLI to use specific models:- Configuration File
- Environment Variables
- WebUI Presets
Edit
~/.claude/settings.json:Model Naming Conventions
The proxy uses consistent naming patterns for model identification:Thinking vs Non-Thinking
| Pattern | Type | Example |
|---|---|---|
Contains thinking | Thinking model | claude-sonnet-4-5-thinking |
| Version 3+ (Gemini) | Thinking model | gemini-3-flash |
No thinking + version 2.x or older | Standard model | claude-sonnet-4-5 |
Model Family Detection
The proxy automatically detects model families:- Claude family: Model name contains
claude - Gemini family: Model name contains
gemini
Context Window Limits
Gemini models have a maximum output token limit of 16,384 tokens. The proxy automatically enforces this limit.
Quota and Rate Limits
Each model has different quota allocations based on your subscription tier:- Ultra tier: Highest quota limits across all models
- Pro tier: Moderate quota limits
- Free tier: Basic quota limits
/account-limits API endpoint.
Cross-Model Conversations
The proxy handles cross-model scenarios:- Signature validation: Checks if cached signatures match the target model family
- Automatic cleanup: Drops incompatible signatures when switching families
- Recovery mechanism: Injects synthetic messages to close interrupted tool loops
Best Practices
- Stick to one family: Use either Claude or Gemini for the entire conversation
- Use fallback strategically: Enable fallback only when needed to maintain signature compatibility
- Monitor model switches: Check logs for signature cleanup warnings
Testing Models
The proxy includes test utilities for each model family:- Claude:
claude-sonnet-4-5-thinking - Gemini:
gemini-3-flash