The @mariozechner/pi-ai package provides functions for streaming LLM responses and awaiting completion results.
Import
import { stream, streamSimple, complete, completeSimple } from "@mariozechner/pi-ai";
stream
Stream LLM responses with provider-specific options.
stream<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: ProviderStreamOptions
): AssistantMessageEventStream
model
The model to use for generation
context
Conversation context with system prompt, messages, and tools
systemPrompt
System prompt for the model
messages
Conversation messages (user, assistant, tool results)
tools
Available tools the model can invoke
temperature
Sampling temperature (0-2). Default varies by provider.
maxTokens
Maximum tokens to generate
API key for authentication
Abort signal for cancellation
transport
'sse' | 'websocket' | 'auto'
Transport protocol (for providers that support multiple)
cacheRetention
'none' | 'short' | 'long'
Prompt cache retention preference. Default: 'short'
Session identifier for session-based caching
Provider-specific metadata
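Cancellation relies on a standard abort signal. The following is a minimal sketch of pairing a signal with a cancel function and an optional timeout. `createCancelScope` is a hypothetical helper that is not part of the package, and the option key `signal` in the usage comment is an assumption, since the parameter name is not shown in the list above.

```typescript
// Hypothetical helper: bundles an AbortSignal with a cancel function and an
// optional timeout. Only the standard AbortController API is used.
interface CancelScope {
  signal: AbortSignal;
  cancel: () => void;
}

function createCancelScope(timeoutMs?: number): CancelScope {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  if (timeoutMs !== undefined) {
    // Abort automatically once the timeout elapses.
    timer = setTimeout(() => controller.abort(), timeoutMs);
  }
  return {
    signal: controller.signal,
    cancel: () => {
      if (timer !== undefined) clearTimeout(timer);
      controller.abort();
    },
  };
}

// Assumed usage (option name is a guess): stream(model, context, { signal: scope.signal })
const scope = createCancelScope();
scope.cancel();
console.log(scope.signal.aborted); // true
```

Keeping the timer inside the helper means a manual cancel also clears the pending timeout, so nothing keeps the process alive after the request is abandoned.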
Return Value
Returns an AssistantMessageEventStream that emits events:
text - Text content delta
thinking - Reasoning content (for extended thinking models)
tool_call - Tool invocation request
usage - Token usage information
stop - Generation finished
error - Error occurred
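Because each event carries a `type` tag, a consumer can fold the stream into a running transcript. The sketch below uses local stand-in types rather than the package's own: only `type` and the `delta` field on text and thinking events appear on this page, and every other field name (`name`, `usage.input`, `usage.output`, `message`) is an assumption made for illustration.

```typescript
// Local stand-ins for the event kinds listed above (not the package's types).
type SketchEvent =
  | { type: "text"; delta: string }
  | { type: "thinking"; delta: string }
  | { type: "tool_call"; name: string } // field name assumed
  | { type: "usage"; usage: { input: number; output: number } } // shape assumed
  | { type: "stop" }
  | { type: "error"; message: string }; // field name assumed

interface Transcript {
  text: string;
  thinking: string;
  toolCalls: string[];
  finished: boolean;
  error?: string;
}

function emptyTranscript(): Transcript {
  return { text: "", thinking: "", toolCalls: [], finished: false };
}

// Pure reducer: returns a new transcript for each incoming event.
function applyEvent(state: Transcript, event: SketchEvent): Transcript {
  switch (event.type) {
    case "text":
      return { ...state, text: state.text + event.delta };
    case "thinking":
      return { ...state, thinking: state.thinking + event.delta };
    case "tool_call":
      return { ...state, toolCalls: [...state.toolCalls, event.name] };
    case "usage":
      return state; // usage is ignored in this sketch
    case "stop":
      return { ...state, finished: true };
    case "error":
      return { ...state, finished: true, error: event.message };
  }
}

const sampleEvents: SketchEvent[] = [
  { type: "thinking", delta: "add the numbers" },
  { type: "text", delta: "2+2 " },
  { type: "text", delta: "is 4" },
  { type: "stop" },
];
const transcript = sampleEvents.reduce(applyEvent, emptyTranscript());
console.log(transcript.text); // 2+2 is 4
```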
Example
import { stream, getModel } from "@mariozechner/pi-ai";

const model = getModel("anthropic", "claude-4.5-sonnet-20250514");
const eventStream = stream(model, {
  systemPrompt: "You are a helpful assistant.",
  messages: [
    { role: "user", content: "What is 2+2?", timestamp: Date.now() }
  ],
}, {
  temperature: 0.7,
  maxTokens: 1000,
});

for await (const event of eventStream) {
  if (event.type === "text") {
    process.stdout.write(event.delta);
  } else if (event.type === "stop") {
    console.log("\nFinished!");
  }
}
streamSimple
Stream LLM responses with unified reasoning parameter.
streamSimple<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: SimpleStreamOptions
): AssistantMessageEventStream
Same as stream() but accepts a reasoning parameter instead of provider-specific thinking options.
reasoning
Thinking level: "off", "minimal", "low", "medium", "high", or "xhigh". Automatically mapped to provider-specific parameters (e.g., a reasoning effort setting for OpenAI, a thinking token budget for Anthropic).
Custom token budgets for thinking levels (token-based providers only)
Token budget for minimal thinking
Token budget for low thinking
Token budget for medium thinking
Token budget for high thinking
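The per-level budgets above can be read as overrides on a default table. The sketch below shows one way such a lookup might work. `resolveBudget`, the default numbers, and the choice to let "xhigh" reuse the "high" budget are all assumptions for illustration; the package's real defaults and resolution rules are not stated on this page.

```typescript
type ThinkingLevel = "off" | "minimal" | "low" | "medium" | "high" | "xhigh";
type BudgetLevel = "minimal" | "low" | "medium" | "high";
type ThinkingBudgets = Partial<Record<BudgetLevel, number>>;

// Placeholder defaults chosen for this sketch only.
const SKETCH_DEFAULTS: Record<BudgetLevel, number> = {
  minimal: 1024,
  low: 2048,
  medium: 8192,
  high: 16384,
};

// Hypothetical helper: picks a token budget for a level, preferring overrides.
function resolveBudget(
  level: ThinkingLevel,
  overrides: ThinkingBudgets = {},
): number | undefined {
  if (level === "off") return undefined; // thinking disabled, no budget
  // Assumption: "xhigh" has no budget entry of its own and reuses "high".
  const key: BudgetLevel = level === "xhigh" ? "high" : level;
  return overrides[key] ?? SKETCH_DEFAULTS[key];
}

console.log(resolveBudget("medium", { medium: 4000 })); // 4000
console.log(resolveBudget("low")); // 2048
```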
Example
import { streamSimple, getModel } from "@mariozechner/pi-ai";

const model = getModel("openai", "gpt-5.3-codex");
const eventStream = streamSimple(model, {
  systemPrompt: "You are a coding assistant.",
  messages: [
    { role: "user", content: "Write a binary search in TypeScript", timestamp: Date.now() }
  ],
}, {
  reasoning: "medium", // Mapped to the provider's own reasoning setting
  temperature: 0.7,
});

for await (const event of eventStream) {
  if (event.type === "thinking") {
    console.log(`[Thinking] ${event.delta}`);
  } else if (event.type === "text") {
    process.stdout.write(event.delta);
  }
}
complete
Wait for the full response from stream().
complete<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: ProviderStreamOptions
): Promise<AssistantMessage>
Returns the complete AssistantMessage after streaming finishes.
Example
import { complete, getModel } from "@mariozechner/pi-ai";

const model = getModel("anthropic", "claude-4.5-sonnet-20250514");
const result = await complete(model, {
  systemPrompt: "You are a helpful assistant.",
  messages: [
    { role: "user", content: "Explain quantum computing", timestamp: Date.now() }
  ],
});

console.log(result.content); // Full response
console.log(result.usage); // Token usage
completeSimple
Wait for the full response from streamSimple().
completeSimple<TApi extends Api>(
  model: Model<TApi>,
  context: Context,
  options?: SimpleStreamOptions
): Promise<AssistantMessage>
Example
import { completeSimple, getModel } from "@mariozechner/pi-ai";

const model = getModel("openai", "gpt-5.3-codex");
const result = await completeSimple(model, {
  systemPrompt: "You are a coding assistant.",
  messages: [
    { role: "user", content: "Write a quicksort", timestamp: Date.now() }
  ],
}, {
  reasoning: "high",
});

console.log(result.content); // Full response with thinking
AssistantMessageEventStream
Event stream interface with async iteration and helper methods.
Methods
result
Wait for the complete assistant message.
result(): Promise<AssistantMessage>
on
Subscribe to specific event types.
on<T extends AssistantMessageEvent>(
  type: T['type'],
  handler: (event: T) => void
): void
Event Types:
text - Text content delta
thinking - Thinking content delta
tool_call - Tool call request
usage - Token usage
stop - Stream finished
error - Error occurred
abort
Cancel the stream.
Example
const eventStream = stream(model, context);

// Subscribe to events
eventStream.on("text", (event) => {
  console.log("Text:", event.delta);
});
eventStream.on("usage", (event) => {
  console.log("Tokens:", event.usage);
});

// Await result
const message = await eventStream.result();
console.log("Final:", message);
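To make the relationship between `on()` and `result()` concrete, here is a minimal, self-contained sketch of a stream-like object with the same two entry points. It is not the package's implementation: `SketchEventStream`, `EventMap`, and `push` are hypothetical names, only two event kinds are modelled, and `result()` resolves to the accumulated text rather than a full AssistantMessage.

```typescript
// Simplified event map for the sketch (the real stream has more event kinds).
interface EventMap {
  text: { type: "text"; delta: string };
  stop: { type: "stop" };
}
type AnyEvent = EventMap[keyof EventMap];

class SketchEventStream {
  private readonly handlers = new Map<keyof EventMap, Array<(event: AnyEvent) => void>>();
  private text = "";
  private settle: (text: string) => void = () => {};
  private readonly finished: Promise<string>;

  constructor() {
    // Capture the resolver so push() can settle the promise on "stop".
    this.finished = new Promise<string>((resolve) => {
      this.settle = resolve;
    });
  }

  // Register a handler for a single event type.
  on<K extends keyof EventMap>(type: K, handler: (event: EventMap[K]) => void): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler as (event: AnyEvent) => void);
    this.handlers.set(type, list);
  }

  // Producer side (hypothetical): called once per incoming event.
  push(event: AnyEvent): void {
    if (event.type === "text") this.text += event.delta;
    for (const handler of this.handlers.get(event.type) ?? []) handler(event);
    if (event.type === "stop") this.settle(this.text);
  }

  // Resolves once a "stop" event has been pushed.
  result(): Promise<string> {
    return this.finished;
  }
}

const sketch = new SketchEventStream();
const seen: string[] = [];
sketch.on("text", (event) => {
  seen.push(event.delta);
});
sketch.push({ type: "text", delta: "Hello" });
sketch.push({ type: "stop" });
console.log(seen.join("")); // Hello
```

Handlers run synchronously as each event arrives, while `result()` only settles at the end, which is why the two can be combined freely in the example above.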
Context Type
interface Context {
  systemPrompt: string;
  messages: Message[];
  tools?: Tool[];
}
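For multi-turn conversations, the context is typically rebuilt by appending to `messages`. The sketch below shows an immutable append using local stand-in types. `SketchMessage`, `SketchContext`, and `withUserMessage` are hypothetical names, and the message shape (role, content, timestamp) simply follows the examples on this page; the package's Message type may have more variants.

```typescript
// Local stand-ins mirroring the Context interface above.
interface SketchMessage {
  role: "user" | "assistant";
  content: string;
  timestamp: number;
}

interface SketchContext {
  systemPrompt: string;
  messages: SketchMessage[];
}

// Hypothetical helper: returns a new context with one more user message,
// leaving the original context untouched.
function withUserMessage(
  context: SketchContext,
  content: string,
  timestamp: number = Date.now(),
): SketchContext {
  return {
    ...context,
    messages: [...context.messages, { role: "user", content, timestamp }],
  };
}

const base: SketchContext = { systemPrompt: "You are a helpful assistant.", messages: [] };
const next = withUserMessage(base, "What is 2+2?");
console.log(base.messages.length, next.messages.length); // 0 1
```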
ThinkingLevel Type
type ThinkingLevel = "off" | "minimal" | "low" | "medium" | "high" | "xhigh";
Note: "xhigh" is only supported by GPT-5.2/5.3 models and Anthropic Opus 4.6 models.
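Since "xhigh" is limited to certain models, callers may want to downgrade it before sending a request. The sketch below is one possible guard. `clampThinkingLevel` and the `supportsXhigh` flag are hypothetical, and falling back to "high" is an assumption; this page does not say how the package handles an unsupported level.

```typescript
type ThinkingLevel = "off" | "minimal" | "low" | "medium" | "high" | "xhigh";

// Hypothetical guard: downgrade "xhigh" to "high" when the model lacks support.
function clampThinkingLevel(level: ThinkingLevel, supportsXhigh: boolean): ThinkingLevel {
  return level === "xhigh" && !supportsXhigh ? "high" : level;
}

console.log(clampThinkingLevel("xhigh", false)); // high
console.log(clampThinkingLevel("xhigh", true)); // xhigh
```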