
pi-ai streams LLM responses as typed events with distinct phases. Every call to stream() returns an AssistantMessageEventStream that emits events you can iterate with for await. Use these events to drive real-time UI updates, track partial tool arguments, or detect errors mid-stream.

Streaming vs. non-streaming functions

| Function | Returns | Use when |
|---|---|---|
| `stream(model, context, options?)` | Async iterable of all event types | You need fine-grained control over every event |
| `streamSimple(model, context, options?)` | Async iterable of all event types | You want the unified `reasoning` option instead of provider-specific options |
| `complete(model, context, options?)` | `Promise<AssistantMessage>` | You don't need incremental output |
| `completeSimple(model, context, options?)` | `Promise<AssistantMessage>` | Same, with the unified `reasoning` option |
stream and complete accept provider-specific options (e.g., thinkingEnabled for Anthropic, reasoningEffort for OpenAI). streamSimple and completeSimple accept a unified reasoning level ('minimal' | 'low' | 'medium' | 'high' | 'xhigh') that is mapped to the provider’s native format automatically.
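To make the mapping concrete, here is a hypothetical sketch of how a unified reasoning level could translate into provider-native options. pi-ai performs an equivalent translation internally; the exact mapping below is an assumption for illustration, not the library's actual logic.

```typescript
type ReasoningLevel = 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';

// Hypothetical translation from the unified level to provider-specific
// options. Only the option names (thinkingEnabled, reasoningEffort) come
// from the documentation above; the mapping rules are illustrative.
function toProviderOptions(
  provider: 'anthropic' | 'openai',
  level: ReasoningLevel
): Record<string, unknown> {
  if (provider === 'anthropic') {
    // Anthropic models take a thinking toggle
    return { thinkingEnabled: level !== 'minimal' };
  }
  // OpenAI models take a reasoning-effort level
  return { reasoningEffort: level };
}
```

With `streamSimple` and `completeSimple`, you never write this yourself; you pass `reasoning: 'medium'` and the library picks the native format for whichever provider the model belongs to.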

Complete event reference

| Event type | Description | Key properties |
|---|---|---|
| `start` | Stream begins | `partial`: initial empty `AssistantMessage` |
| `text_start` | A text block starts | `contentIndex`: position in the `content` array |
| `text_delta` | Text chunk received | `delta`: new text; `contentIndex` |
| `text_end` | Text block complete | `content`: full text; `contentIndex` |
| `thinking_start` | A thinking block starts | `contentIndex` |
| `thinking_delta` | Thinking chunk received | `delta`: new thinking text; `contentIndex` |
| `thinking_end` | Thinking block complete | `content`: full thinking text; `contentIndex` |
| `toolcall_start` | A tool call begins | `contentIndex` |
| `toolcall_delta` | Tool arguments streaming | `delta`: JSON chunk; `partial.content[contentIndex].arguments`: partially parsed args |
| `toolcall_end` | Tool call complete | `toolCall`: complete `ToolCall` with `id`, `name`, `arguments` |
| `done` | Stream complete | `reason`: `"stop" \| "length" \| "toolUse"`; `message`: final `AssistantMessage` |
| `error` | Error or abort | `reason`: `"error" \| "aborted"`; `error`: `AssistantMessage` with partial content |
Events for different content blocks are not guaranteed to be contiguous. A provider may emit text and tool call deltas interleaved in the same upstream chunk, so you may receive text_delta, toolcall_delta, text_delta in sequence. Always use contentIndex to associate each delta with its block — do not assume a block’s start/delta/end sequence is uninterrupted.
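Because deltas can interleave, accumulate each block's chunks keyed by contentIndex. A minimal sketch of that bookkeeping, using hand-built stand-in events (the real events carry more fields than shown here):

```typescript
// Reduced event shape for this sketch.
type DeltaEvent = {
  type: 'text_delta' | 'toolcall_delta';
  contentIndex: number;
  delta: string;
};

// Accumulate deltas per block, keyed by contentIndex, so interleaved
// text and tool-call chunks never get mixed together.
function accumulate(events: DeltaEvent[]): Map<number, string> {
  const blocks = new Map<number, string>();
  for (const ev of events) {
    blocks.set(ev.contentIndex, (blocks.get(ev.contentIndex) ?? '') + ev.delta);
  }
  return blocks;
}

// Interleaved sequence: text, tool call, text, tool call.
const blocks = accumulate([
  { type: 'text_delta', contentIndex: 0, delta: 'Hel' },
  { type: 'toolcall_delta', contentIndex: 1, delta: '{"city":' },
  { type: 'text_delta', contentIndex: 0, delta: 'lo' },
  { type: 'toolcall_delta', contentIndex: 1, delta: '"Paris"}' },
]);
// blocks.get(0) === 'Hello'; blocks.get(1) === '{"city":"Paris"}'
```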

Stop reasons

Every AssistantMessage (and every done event) carries a stopReason:
| Stop reason | Meaning |
|---|---|
| `"stop"` | Normal completion: the model finished its response |
| `"length"` | Output hit the maximum token limit |
| `"toolUse"` | Model is calling tools and expects tool results before continuing |
| `"error"` | An error occurred during generation |
| `"aborted"` | Request was cancelled via `AbortSignal` |

Aborting requests

Pass an AbortSignal in the options to cancel an in-progress request. Aborted requests emit an error event with reason: 'aborted' and produce an AssistantMessage with stopReason: 'aborted'.
```typescript
import { getModel, stream } from '@earendil-works/pi-ai';

const model = getModel('openai', 'gpt-4o-mini');
const controller = new AbortController();

// Abort after 2 seconds
setTimeout(() => controller.abort(), 2000);

const s = stream(model, {
  messages: [{ role: 'user', content: 'Write a long story' }]
}, {
  signal: controller.signal
});

for await (const event of s) {
  if (event.type === 'text_delta') {
    process.stdout.write(event.delta);
  } else if (event.type === 'error') {
    // event.reason is 'error' or 'aborted'
    console.log(`${event.reason === 'aborted' ? 'Aborted' : 'Error'}:`, event.error.errorMessage);
  }
}

const response = await s.result();
if (response.stopReason === 'aborted') {
  console.log('Request was aborted:', response.errorMessage);
  console.log('Partial content received:', response.content);
  console.log('Tokens used:', response.usage);
}
```

Continuing after abort

Aborted messages can be added to the conversation context and continued in a subsequent request:
```typescript
import { complete, getModel } from '@earendil-works/pi-ai';

const model = getModel('openai', 'gpt-4o-mini');

const context = {
  messages: [
    { role: 'user', content: 'Explain quantum computing in detail' }
  ]
};

// First request gets aborted after 2 seconds
const controller1 = new AbortController();
setTimeout(() => controller1.abort(), 2000);

const partial = await complete(model, context, { signal: controller1.signal });

// Add the partial response to the context
context.messages.push(partial);
context.messages.push({ role: 'user', content: 'Please continue' });

// Continue the conversation
const continuation = await complete(model, context);
```

Thinking and reasoning

Models that expose a reasoning property support thinking/reasoning mode. Reasoning options passed to a non-reasoning model are silently ignored.
Use streamSimple or completeSimple with the reasoning option for a provider-agnostic interface:
```typescript
import { getModel, streamSimple, completeSimple } from '@earendil-works/pi-ai';

const model = getModel('anthropic', 'claude-sonnet-4-20250514');

// Check if the model supports reasoning
if (model.reasoning) {
  console.log('Model supports reasoning/thinking');
}

// Use the simplified reasoning option
const response = await completeSimple(model, {
  messages: [{ role: 'user', content: 'Solve: 2x + 5 = 13' }]
}, {
  reasoning: 'medium'  // 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Response:', block.text);
  }
}
```

Error handling

When a request ends with an error, the stream emits an error event. The partial AssistantMessage on event.error contains any content received before the failure.
```typescript
const s = stream(model, context);

for await (const event of s) {
  if (event.type === 'error') {
    // event.reason is "error" or "aborted"
    // event.error is the AssistantMessage with partial content
    console.error(`Error (${event.reason}):`, event.error.errorMessage);
    console.log('Partial content:', event.error.content);
  }
}

// The final message always reflects the outcome
const message = await s.result();
if (message.stopReason === 'error' || message.stopReason === 'aborted') {
  console.error('Request failed:', message.errorMessage);
  console.log('Partial content received:', message.content);
  console.log('Partial token counts:', message.usage);
}
```

Debugging with onPayload

Use the onPayload callback to inspect the exact request payload sent to the provider. This is useful for diagnosing request formatting issues or unexpected provider errors.
```typescript
const response = await complete(model, context, {
  onPayload: (payload) => {
    console.log('Provider payload:', JSON.stringify(payload, null, 2));
  }
});
```
onPayload is supported by stream, complete, streamSimple, and completeSimple.
