Documentation Index
Fetch the complete documentation index at: https://mintlify.com/badlogic/pi-mono/llms.txt
Use this file to discover all available pages before exploring further.
Pi’s AI library supports seamless handoffs between different LLM providers within the same conversation. You can switch models mid-conversation while preserving context, including thinking blocks, tool calls, and tool results.
How It Works
When messages from one provider are sent to a different provider, the library automatically transforms them for compatibility:
- User and tool result messages are passed through unchanged
- Assistant messages from the same provider/API are preserved as-is
- Assistant messages from different providers have their thinking blocks converted to text with
<thinking> tags
- Tool calls and regular text are preserved unchanged
This enables you to start with one model, then switch to another while maintaining conversation continuity.
Quick Example
import { getModel, complete, type Context } from "@mariozechner/pi-ai";
// Start with Claude
const claude = getModel("anthropic", "claude-sonnet-4-20250514");
const context: Context = { messages: [] };
context.messages.push({ role: "user", content: "What is 25 * 18?" });
const claudeResponse = await complete(claude, context, {
thinkingEnabled: true
});
context.messages.push(claudeResponse);
// Switch to GPT-5 - it will see Claude's thinking as <thinking> tagged text
const gpt5 = getModel("openai", "gpt-5-mini");
context.messages.push({ role: "user", content: "Is that calculation correct?" });
const gptResponse = await complete(gpt5, context);
context.messages.push(gptResponse);
// Switch to Gemini
const gemini = getModel("google", "gemini-2.5-flash");
context.messages.push({ role: "user", content: "What was the original question?" });
const geminiResponse = await complete(gemini, context);
With Pi SDK
Use cross-provider handoffs in Pi sessions:
import { getModel } from "@mariozechner/pi-ai";
import { createAgentSession, AuthStorage, ModelRegistry } from "@mariozechner/pi-coding-agent";
const authStorage = AuthStorage.create();
const modelRegistry = new ModelRegistry(authStorage);
const haiku = getModel("anthropic", "claude-haiku-4-5");
const { session } = await createAgentSession({
model: haiku,
authStorage,
modelRegistry,
});
await session.prompt("Analyze the authentication code in this repo");
Change to a different model mid-conversation:
const sonnet = getModel("anthropic", "claude-sonnet-4-5");
await session.setModel(sonnet);
await session.prompt("Now refactor it to use OAuth");
Switch to a completely different provider:
const gpt5 = getModel("openai", "gpt-5-mini");
await session.setModel(gpt5);
await session.prompt("Review the changes for security issues");
Use Cases
Fast to Capable
Specialized Models
Failover
Cost Optimization
Start with a fast model for initial responses, then switch to a more capable model for complex reasoning:// Quick initial scan with Haiku
const haiku = getModel("anthropic", "claude-haiku-4-5");
const { session } = await createAgentSession({ model: haiku });
await session.prompt("Find all authentication code");
// Deep analysis with Opus
const opus = getModel("anthropic", "claude-opus-4-5");
await session.setModel(opus);
await session.prompt("Analyze these files for security vulnerabilities");
Use specialized models for specific tasks:// Code generation with Codex
const codex = getModel("openai-codex", "gpt-5-codex");
const { session } = await createAgentSession({ model: codex });
await session.prompt("Implement the OAuth flow");
// Code review with Claude
const sonnet = getModel("anthropic", "claude-sonnet-4-5");
await session.setModel(sonnet);
await session.prompt("Review the implementation for best practices");
Maintain conversation continuity across provider outages:try {
await session.prompt("Continue the refactoring");
} catch (error) {
if (error.message.includes("rate limit")) {
// Switch to backup provider
const backup = getModel("google", "gemini-2.5-flash");
await session.setModel(backup);
await session.prompt("Continue the refactoring");
}
}
Use cheaper models for simple tasks:// Cheap model for simple queries
const haiku = getModel("anthropic", "claude-haiku-4-5");
const { session } = await createAgentSession({ model: haiku });
await session.prompt("List all test files");
// Expensive model only when needed
const opus = getModel("anthropic", "claude-opus-4-5");
await session.setModel(opus);
await session.prompt("Generate comprehensive test coverage");
Context Serialization
The Context object can be serialized for persistence or transfer:
import { type Context, getModel, complete } from "@mariozechner/pi-ai";
// Create and use a context
const context: Context = {
systemPrompt: "You are a helpful assistant.",
messages: [
{ role: "user", content: "What is TypeScript?" }
]
};
const claude = getModel("anthropic", "claude-sonnet-4-5");
const response = await complete(claude, context);
context.messages.push(response);
// Serialize the entire context
const serialized = JSON.stringify(context);
// Save to database, localStorage, file, etc.
localStorage.setItem("conversation", serialized);
// Later: deserialize and continue with any model
const restored: Context = JSON.parse(localStorage.getItem("conversation")!);
restored.messages.push({ role: "user", content: "Tell me more about its type system" });
const gpt5 = getModel("openai", "gpt-5-mini");
const continuation = await complete(gpt5, restored);
If the context contains images (encoded as base64), those will also be serialized.
Provider Compatibility
All providers can handle messages from other providers:
| Content Type | Compatibility |
|---|
| Text content | ✓ Fully compatible |
| Tool calls | ✓ Fully compatible |
| Tool results (text) | ✓ Fully compatible |
| Tool results (images) | ✓ Fully compatible (for vision models) |
| Thinking blocks | ✓ Converted to tagged text for cross-provider |
| Aborted messages | ✓ Partial content preserved |
Thinking Block Conversion
When switching providers, thinking blocks are transformed:
Same Provider
Different Providers
Thinking blocks are preserved as-is:// Claude → Claude
const sonnet = getModel("anthropic", "claude-sonnet-4-5");
const response = await complete(sonnet, context, { thinkingEnabled: true });
// response.content contains { type: "thinking", thinking: "..." }
// Send to another Claude model
const opus = getModel("anthropic", "claude-opus-4-5");
context.messages.push(response);
await complete(opus, context);
// Opus sees the thinking block natively
Thinking blocks are converted to tagged text:// Claude → GPT-5
const sonnet = getModel("anthropic", "claude-sonnet-4-5");
const response = await complete(sonnet, context, { thinkingEnabled: true });
// response.content contains { type: "thinking", thinking: "Let me calculate..." }
// Send to GPT-5
const gpt5 = getModel("openai", "gpt-5-mini");
context.messages.push(response);
await complete(gpt5, context);
// GPT-5 sees: "<thinking>Let me calculate...</thinking>"
Aborted Messages
Aborted messages can be added to the conversation context and continued:
const controller = new AbortController();
setTimeout(() => controller.abort(), 2000);
const partial = await complete(model, context, { signal: controller.signal });
// partial.stopReason === "aborted"
// partial.content contains partial text/tool calls
// Add to context and continue with different model
context.messages.push(partial);
context.messages.push({ role: "user", content: "Please continue" });
const continuation = await complete(differentModel, context);
Best Practices
Choose Appropriate Models
Match models to task requirements:
Fast models (Haiku, GPT-4o-mini) for simple queries
Balanced models (Sonnet, GPT-5-mini) for general tasks
Powerful models (Opus, GPT-5) for complex reasoning
Track token usage across providers:
let totalCost = 0;
session.subscribe((event) => {
if (event.type === "agent_end") {
for (const msg of event.messages) {
if (msg.role === "assistant") {
totalCost += msg.usage.cost.total;
}
}
console.log(`Total cost: $${totalCost.toFixed(4)}`);
}
});
Handle Provider-Specific Features
Be aware of feature differences:
Thinking/reasoning - Not all models support it
Vision - Check model.input.includes('image')
Context window - Different models have different limits
Tool calling - All Pi-supported models support tools
Test Cross-Provider Flows
Validate that your workflows work across providers:
const providers = [
getModel("anthropic", "claude-sonnet-4-5"),
getModel("openai", "gpt-5-mini"),
getModel("google", "gemini-2.5-flash"),
];
for (const provider of providers) {
const { session } = await createAgentSession({ model: provider });
await session.prompt("Test prompt");
// Verify behavior
}
Example: Multi-Stage Workflow
Here’s a complete example showing a multi-stage workflow with different models:
import { getModel } from "@mariozechner/pi-ai";
import {
createAgentSession,
AuthStorage,
ModelRegistry,
} from "@mariozechner/pi-coding-agent";
const authStorage = AuthStorage.create();
const modelRegistry = new ModelRegistry(authStorage);
// Stage 1: Fast scan with Haiku
const haiku = getModel("anthropic", "claude-haiku-4-5");
const { session } = await createAgentSession({
model: haiku,
authStorage,
modelRegistry,
});
console.log("Stage 1: Quick scan with Haiku");
await session.prompt("Find all authentication-related files");
// Stage 2: Deep analysis with Sonnet + thinking
const sonnet = getModel("anthropic", "claude-sonnet-4-5");
await session.setModel(sonnet);
session.setThinkingLevel("high");
console.log("Stage 2: Analysis with Sonnet + thinking");
await session.prompt("Analyze these files for security vulnerabilities");
// Stage 3: Implementation with GPT-5 Codex
const codex = getModel("openai-codex", "gpt-5-codex");
await session.setModel(codex);
console.log("Stage 3: Implementation with Codex");
await session.prompt("Implement fixes for the vulnerabilities found");
// Stage 4: Review with Opus
const opus = getModel("anthropic", "claude-opus-4-5");
await session.setModel(opus);
console.log("Stage 4: Final review with Opus");
await session.prompt("Review the implementation for correctness");
console.log("Workflow complete!");
Next Steps