FlueSession is the conversation thread within a harness. It holds message history and exposes the core methods for driving agent interactions: prompt(), skill(), task(), shell(), and compact(). Obtain a session via harness.session().
Properties
- The session name. Defaults to "default".
- Out-of-band filesystem access — same interface as harness.fs. Operations do not appear in the conversation transcript. See FlueHarness for the full FlueFs method table.

prompt(text, options?)
Send a user message and get a response. The LLM takes one or more turns (calling tools, thinking, producing text) before returning.
- The user message to send.
- Schema for structured output. When provided, the response includes a typed data field and the return type becomes PromptResultResponse<T>.
- Call-scoped role override. Overrides session and harness roles.
- Call-scoped model override ('provider/model-id'). Overrides role and harness model.
- Reasoning effort for this call. Overrides role and harness thinkingLevel.
- Additional call-scoped tools. Names must not conflict with built-in or harness tools.
- Images to attach to the user message. Requires a vision-capable model. Each image is { type: 'image', data: base64string, mimeType: string }.
- Cancel this call. Rejects with AbortError.

Returns CallHandle<PromptResponse>, or CallHandle<PromptResultResponse<T>> when result is provided.
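As a sketch of how a result schema changes the return type: the interfaces below paraphrase the Response types section of this page, and promptWith is a hypothetical stand-in for session.prompt(), not Flue's implementation.

```typescript
// Illustrative response shapes paraphrasing the "Response types"
// section of this page; not the library's actual definitions.
interface PromptResponse {
  text: string; // the model's final text output
}
interface PromptResultResponse<T> extends PromptResponse {
  data: T; // schema-validated structured output
}

// Here a "schema" is just a parser: unknown in, T out (or throw).
type Schema<T> = (raw: unknown) => T;

// Hypothetical stand-in for session.prompt(text, { result }):
// passing a schema adds a typed `data` field to the return value.
function promptWith<T>(text: string, result: Schema<T>): PromptResultResponse<T> {
  // A real call would take one or more model turns; fake the raw output here.
  const raw: unknown = { summary: text.toUpperCase() };
  return { text: "done", data: result(raw) };
}

const summarySchema: Schema<{ summary: string }> = (raw) => {
  const r = raw as { summary?: unknown };
  if (typeof r.summary !== "string") throw new Error("schema validation failed");
  return { summary: r.summary };
};

const res = promptWith("hello", summarySchema);
console.log(res.data.summary); // "HELLO"
```

Without a schema the caller only gets text back; with one, the data field is typed and validated before the call resolves.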
skill(name, options?)
Invoke a reusable skill defined as a Markdown file in .agents/skills/. The model reads the skill file from disk at call time, so a skill can be edited mid-session without re-initializing.
- Skill name (from frontmatter name:) or relative path under .agents/skills/.
- Arguments passed into the skill template.
- Schema for structured output.
- Call-scoped role override.
- Call-scoped model override.
- Reasoning effort for this call.
- Cancel this call.

Returns CallHandle<PromptResponse> or CallHandle<PromptResultResponse<T>>.
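To illustrate the name-vs-path convention, here is a hypothetical resolver. The library's actual lookup logic is not documented on this page and may differ; this only mirrors the description above.

```typescript
import { join } from "node:path";

// Hypothetical resolver sketch: a bare skill name (from frontmatter
// `name:`) maps to <name>.md under .agents/skills/, while anything
// containing a path separator or .md extension is treated as a
// relative path. The real lookup may differ.
function resolveSkillPath(nameOrPath: string): string {
  const isPath = nameOrPath.includes("/") || nameOrPath.endsWith(".md");
  return join(".agents", "skills", isPath ? nameOrPath : `${nameOrPath}.md`);
}

console.log(resolveSkillPath("review"));         // ".agents/skills/review.md"
console.log(resolveSkillPath("team/review.md")); // ".agents/skills/team/review.md"
```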
task(text, options?)
Launch a detached child agent in a separate session for focused, one-shot work. Shares the same sandbox and filesystem as the parent, but gets its own message history. The LLM can also call the task tool autonomously during prompt() and skill() calls.
- The task prompt for the child agent.
- Working directory for the child session. Defaults to the parent session’s cwd.
- Schema for structured output.
- Role for the child agent.
- Model for the child agent.
- Cancel this task.

Returns CallHandle<PromptResponse> or CallHandle<PromptResultResponse<T>>.
shell(command, options?)
Run a shell command. Unlike harness.shell(), this call is recorded in the conversation transcript, making the output visible to the model in subsequent turns.
- Shell command to execute.
- Working directory.
- Additional environment variables.
- Cancel this call.

Returns CallHandle<ShellResult> — { stdout, stderr, exitCode }.
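The ShellResult shape can be mirrored with Node's built-in child_process. This is an illustrative stand-in for what session.shell() returns, not Flue's implementation:

```typescript
import { spawnSync } from "node:child_process";

// Minimal stand-in for the documented ShellResult shape.
interface ShellResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Run a command through the system shell, mirroring the return shape
// of session.shell() (the env option is omitted here for brevity).
function runShell(command: string, cwd?: string): ShellResult {
  const r = spawnSync(command, { shell: true, cwd, encoding: "utf8" });
  return { stdout: r.stdout ?? "", stderr: r.stderr ?? "", exitCode: r.status ?? -1 };
}

const res = runShell("echo hello");
console.log(res.stdout.trim()); // "hello"
console.log(res.exitCode);      // 0
```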
Use harness.shell() for setup work the model shouldn’t see (e.g. cloning a repo before starting a session). Use session.shell() when the output should be visible to the model in the conversation.

compact()
Trigger context compaction on demand. Summarizes older messages to free context window space — the same operation that runs automatically when the session approaches the token limit.
- Resolves as a no-op if there is nothing to compact.
- Throws if another operation (prompt, skill, task, shell) is in-flight on this session.
- Emits compaction_start (with reason: "manual") followed by compaction events.
delete()
Delete this session’s stored conversation state.
Response types
PromptResponse
Returned by prompt(), skill(), and task() without a result schema.
- The model’s final text output.
- Aggregated token and cost usage for the entire call (all turns, retries, and any compaction triggered during the call).
- The model selected for the call’s primary turn ({ id: string }).

PromptResultResponse<T>
Returned when a result schema is provided.
- The schema-validated structured output. Replaces the deprecated result field.
- Token and cost usage.
- Selected model for the call.
PromptUsage
- Input tokens consumed.
- Output tokens produced.
- Tokens read from prompt cache.
- Tokens written to prompt cache.
- Sum of all token counts.
- Total cost in the provider’s pricing unit (USD for Anthropic, OpenAI, etc.). Computed as (rate_per_million / 1_000_000) * tokens.
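The arithmetic above can be sketched as a small helper. The field names here are illustrative paraphrases of the descriptions, not confirmed PromptUsage property names:

```typescript
// Sketch of the usage totals described above. Field names are
// illustrative paraphrases, not confirmed PromptUsage properties.
interface Usage {
  inputTokens: number;      // input tokens consumed
  outputTokens: number;     // output tokens produced
  cacheReadTokens: number;  // tokens read from prompt cache
  cacheWriteTokens: number; // tokens written to prompt cache
}

// Sum of all token counts.
function totalTokens(u: Usage): number {
  return u.inputTokens + u.outputTokens + u.cacheReadTokens + u.cacheWriteTokens;
}

// cost = (rate_per_million / 1_000_000) * tokens, per the formula above.
function cost(tokens: number, ratePerMillionUsd: number): number {
  return (ratePerMillionUsd / 1_000_000) * tokens;
}

const u: Usage = { inputTokens: 800, outputTokens: 150, cacheReadTokens: 50, cacheWriteTokens: 0 };
console.log(totalTokens(u));          // 1000
console.log(cost(totalTokens(u), 3)); // ≈ 0.003 USD at $3 per million tokens
```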