
pi-ai streams LLM responses as typed events with distinct phases. Every call to stream() returns an AssistantMessageEventStream that emits events you can iterate with for await. Use these events to drive real-time UI updates, track partial tool arguments, or detect errors mid-stream.

Streaming vs. non-streaming functions

| Function | Returns | Use when |
|---|---|---|
| `stream(model, context, options?)` | Async iterable of all event types | You need fine-grained control over every event |
| `streamSimple(model, context, options?)` | Async iterable of all event types | You want the unified `reasoning` option instead of provider-specific options |
| `complete(model, context, options?)` | `Promise<AssistantMessage>` | You don't need incremental output |
| `completeSimple(model, context, options?)` | `Promise<AssistantMessage>` | Same, with the unified `reasoning` option |
stream and complete accept provider-specific options (e.g., thinkingEnabled for Anthropic, reasoningEffort for OpenAI). streamSimple and completeSimple accept a unified reasoning level ('minimal' | 'low' | 'medium' | 'high' | 'xhigh') that is mapped to the provider’s native format automatically.
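To make the mapping concrete, here is a hypothetical sketch of how a unified reasoning level could translate into provider-native options. pi-ai performs an equivalent translation internally; the exact mapping below is an assumption for illustration, not the library's actual logic.

```typescript
type ReasoningLevel = 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';

// Hypothetical translation from the unified level to provider-specific
// options. Only the option names (thinkingEnabled, reasoningEffort) come
// from the documentation above; the mapping rules are illustrative.
function toProviderOptions(
  provider: 'anthropic' | 'openai',
  level: ReasoningLevel
): Record<string, unknown> {
  if (provider === 'anthropic') {
    // Anthropic models take a thinking toggle
    return { thinkingEnabled: level !== 'minimal' };
  }
  // OpenAI models take a reasoning-effort level
  return { reasoningEffort: level };
}
```

With `streamSimple` and `completeSimple`, you never write this yourself; you pass `reasoning: 'medium'` and the library picks the native format for whichever provider the model belongs to.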

Complete event reference

| Event type | Description | Key properties |
|---|---|---|
| `start` | Stream begins | `partial`: initial empty `AssistantMessage` |
| `text_start` | A text block starts | `contentIndex`: position in the `content` array |
| `text_delta` | Text chunk received | `delta`: new text; `contentIndex` |
| `text_end` | Text block complete | `content`: full text; `contentIndex` |
| `thinking_start` | A thinking block starts | `contentIndex` |
| `thinking_delta` | Thinking chunk received | `delta`: new thinking text; `contentIndex` |
| `thinking_end` | Thinking block complete | `content`: full thinking text; `contentIndex` |
| `toolcall_start` | A tool call begins | `contentIndex` |
| `toolcall_delta` | Tool arguments streaming | `delta`: JSON chunk; `partial.content[contentIndex].arguments`: partially parsed args |
| `toolcall_end` | Tool call complete | `toolCall`: complete `ToolCall` with `id`, `name`, `arguments` |
| `done` | Stream complete | `reason`: `"stop" \| "length" \| "toolUse"`; `message`: final `AssistantMessage` |
| `error` | Error or abort | `reason`: `"error" \| "aborted"`; `error`: `AssistantMessage` with partial content |
Events for different content blocks are not guaranteed to be contiguous. A provider may emit text and tool call deltas interleaved in the same upstream chunk, so you may receive text_delta, toolcall_delta, text_delta in sequence. Always use contentIndex to associate each delta with its block — do not assume a block’s start/delta/end sequence is uninterrupted.
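Because deltas can interleave, accumulate each block's chunks keyed by contentIndex. A minimal sketch of that bookkeeping, using hand-built stand-in events (the real events carry more fields than shown here):

```typescript
// Reduced event shape for this sketch.
type DeltaEvent = {
  type: 'text_delta' | 'toolcall_delta';
  contentIndex: number;
  delta: string;
};

// Accumulate deltas per block, keyed by contentIndex, so interleaved
// text and tool-call chunks never get mixed together.
function accumulate(events: DeltaEvent[]): Map<number, string> {
  const blocks = new Map<number, string>();
  for (const ev of events) {
    blocks.set(ev.contentIndex, (blocks.get(ev.contentIndex) ?? '') + ev.delta);
  }
  return blocks;
}

// Interleaved sequence: text, tool call, text, tool call.
const blocks = accumulate([
  { type: 'text_delta', contentIndex: 0, delta: 'Hel' },
  { type: 'toolcall_delta', contentIndex: 1, delta: '{"city":' },
  { type: 'text_delta', contentIndex: 0, delta: 'lo' },
  { type: 'toolcall_delta', contentIndex: 1, delta: '"Paris"}' },
]);
// blocks.get(0) === 'Hello'; blocks.get(1) === '{"city":"Paris"}'
```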

Stop reasons

Every AssistantMessage (and every done event) carries a stopReason:
| Stop reason | Meaning |
|---|---|
| `"stop"` | Normal completion: the model finished its response |
| `"length"` | Output hit the maximum token limit |
| `"toolUse"` | Model is calling tools and expects tool results before continuing |
| `"error"` | An error occurred during generation |
| `"aborted"` | Request was cancelled via `AbortSignal` |

Aborting requests

Pass an AbortSignal in the options to cancel an in-progress request. Aborted requests emit an error event with reason: 'aborted' and produce an AssistantMessage with stopReason: 'aborted'.
```typescript
import { getModel, stream } from '@earendil-works/pi-ai';

const model = getModel('openai', 'gpt-4o-mini');
const controller = new AbortController();

// Abort after 2 seconds
setTimeout(() => controller.abort(), 2000);

const s = stream(model, {
  messages: [{ role: 'user', content: 'Write a long story' }]
}, {
  signal: controller.signal
});

for await (const event of s) {
  if (event.type === 'text_delta') {
    process.stdout.write(event.delta);
  } else if (event.type === 'error') {
    // event.reason is 'error' or 'aborted'
    console.log(`${event.reason === 'aborted' ? 'Aborted' : 'Error'}:`, event.error.errorMessage);
  }
}

const response = await s.result();
if (response.stopReason === 'aborted') {
  console.log('Request was aborted:', response.errorMessage);
  console.log('Partial content received:', response.content);
  console.log('Tokens used:', response.usage);
}
```

Continuing after abort

Aborted messages can be added to the conversation context and continued in a subsequent request:
```typescript
import { complete, getModel } from '@earendil-works/pi-ai';

const model = getModel('openai', 'gpt-4o-mini');

const context = {
  messages: [
    { role: 'user', content: 'Explain quantum computing in detail' }
  ]
};

// First request gets aborted after 2 seconds
const controller1 = new AbortController();
setTimeout(() => controller1.abort(), 2000);

const partial = await complete(model, context, { signal: controller1.signal });

// Add the partial response to the context
context.messages.push(partial);
context.messages.push({ role: 'user', content: 'Please continue' });

// Continue the conversation
const continuation = await complete(model, context);
```

Thinking and reasoning

Models that expose a reasoning property support thinking/reasoning mode. Reasoning options passed to a non-reasoning model are silently ignored.
Use streamSimple or completeSimple with the reasoning option for a provider-agnostic interface:
```typescript
import { getModel, streamSimple, completeSimple } from '@earendil-works/pi-ai';

const model = getModel('anthropic', 'claude-sonnet-4-20250514');

// Check if the model supports reasoning
if (model.reasoning) {
  console.log('Model supports reasoning/thinking');
}

// Use the simplified reasoning option
const response = await completeSimple(model, {
  messages: [{ role: 'user', content: 'Solve: 2x + 5 = 13' }]
}, {
  reasoning: 'medium'  // 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Response:', block.text);
  }
}
```

Error handling

When a request ends with an error, the stream emits an error event. The partial AssistantMessage on event.error contains any content received before the failure.
```typescript
const s = stream(model, context);

for await (const event of s) {
  if (event.type === 'error') {
    // event.reason is "error" or "aborted"
    // event.error is the AssistantMessage with partial content
    console.error(`Error (${event.reason}):`, event.error.errorMessage);
    console.log('Partial content:', event.error.content);
  }
}

// The final message always reflects the outcome
const message = await s.result();
if (message.stopReason === 'error' || message.stopReason === 'aborted') {
  console.error('Request failed:', message.errorMessage);
  console.log('Partial content received:', message.content);
  console.log('Partial token counts:', message.usage);
}
```

Debugging with onPayload

Use the onPayload callback to inspect the exact request payload sent to the provider. This is useful for diagnosing request formatting issues or unexpected provider errors.
```typescript
const response = await complete(model, context, {
  onPayload: (payload) => {
    console.log('Provider payload:', JSON.stringify(payload, null, 2));
  }
});
```
onPayload is supported by stream, complete, streamSimple, and completeSimple.
