SDD has two concepts that often get conflated: a skill (what the AI knows how to do) and an agent (how it’s executed). Understanding the difference clarifies why the workflow is structured the way it is, why some phases feel “heavier” than others, and how context is managed across a long-running change. The split is intentional — it is the mechanism that keeps context clean, enables model selection, and makes prompt caching possible.

Definitions

Skill

A skill is a markdown instruction file with YAML frontmatter. It lives in skills/sdd-{name}/instructions.md and tells the AI how to perform one step of the workflow.
---
name: sdd-spec
description: SDD Spec - write behavior specs. Usage - /sdd-spec or /sdd-spec {domain}.
model_hint: sonnet
---

# SDD Spec
...
Skills are static content. They don’t run on their own — they are loaded and followed by an AI. The model_hint field signals to orchestrators which model tier to use when spawning a subagent for this skill.
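As a rough illustration of how an orchestrator might read model_hint from a skill file, here is a minimal sketch. It assumes the frontmatter holds only flat `key: value` lines, as in the example above; `parse_frontmatter` is a hypothetical helper, not part of SDD, and a real loader would more likely use a full YAML parser.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Return the key/value pairs between the leading '---' fences.

    Simplified on purpose: flat `key: value` lines only, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing fence: treat as missing frontmatter


SKILL_TEXT = """---
name: sdd-spec
description: SDD Spec - write behavior specs. Usage - /sdd-spec or /sdd-spec {domain}.
model_hint: sonnet
---

# SDD Spec
"""

fields = parse_frontmatter(SKILL_TEXT)
print(fields["name"], fields["model_hint"])
```

Everything after the closing fence is the instruction body the AI follows; only the frontmatter is read as metadata.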

Agent

An agent is a running AI instance with its own conversation context. There are two kinds in SDD:
| Kind | Context | Model | Example |
| --- | --- | --- | --- |
| Orchestrator | Your main conversation | Whatever you picked in the client | The one you’re talking to right now |
| Subagent | Fresh, isolated | Chosen via model_hint | Spawned by /sdd-apply per task |
A subagent starts with no memory of your conversation. The orchestrator hands it a self-contained prompt — instructions plus all the context it needs — the subagent runs, and it returns a summary. Its context is then discarded.
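The lifecycle described above can be sketched in a few lines. This is a toy model, not SDD's implementation: `Subagent`, `spawn`, and `fake_work` are hypothetical names, and the "work" is a stub rather than a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Subagent:
    """Hypothetical stand-in for one isolated subagent run."""
    model: str
    context: list[str] = field(default_factory=list)  # starts empty: no shared memory

    def run(self, prompt: str, work: Callable[[list[str]], str]) -> str:
        self.context.append(prompt)  # the prompt is everything the subagent knows
        return work(self.context)    # the summary is the only thing handed back


def spawn(instructions: str, task_context: str, model: str,
          work: Callable[[list[str]], str]) -> str:
    """Build a self-contained prompt, run a fresh subagent, return its summary."""
    prompt = f"{instructions}\n\n{task_context}"
    agent = Subagent(model=model)
    return agent.run(prompt, work)  # agent goes out of scope here; its context is dropped


# The orchestrator's own history is never passed to the subagent.
orchestrator_history = ["user: add login", "assistant: proposal drafted"]


def fake_work(ctx: list[str]) -> str:
    return f"done; subagent saw {len(ctx)} message(s)"


summary = spawn("# SDD Apply task", "Task 1: add login route", "sonnet", fake_work)
orchestrator_history.append(summary)
print(summary)
```

The point of the sketch is the data flow: the prompt goes in, the summary comes out, and nothing else crosses the boundary in either direction.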

Inline vs subagent execution

Most SDD skills run inline — you invoke the slash command, the AI loads the skill’s instructions, and executes them in your current conversation. Four skills use a different pattern:
| Skill | Mode | Why |
| --- | --- | --- |
| /sdd-design | Subagent | Design analysis is self-contained; isolating it keeps the main context clean |
| /sdd-apply | Orchestrator + one subagent per task | Each task implementation needs full file-reading context; running inline would bloat the main conversation |
| /sdd-verify | Subagent | Runs tests, linters, smoke checks; produces a report, no interactive decisions needed |
| /sdd-discover | Parallel subagents | Domain detection fan-out: each subagent analyzes one domain simultaneously |
Everything else — propose, spec, tasks, archive, audit, steer, init, new, ff, continue, recall, docs — runs inline in your current conversation.
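To make the split concrete, the sketch below encodes the four non-inline skills listed above as a lookup, and shows one plausible shape for the /sdd-discover fan-out using a thread pool. `execution_mode`, `analyze_domain`, and `discover` are hypothetical names, and the worker is a stub; how SDD actually launches parallel subagents is not specified here.

```python
from concurrent.futures import ThreadPoolExecutor

# The four skills that do not run inline; anything else defaults to "inline".
SUBAGENT_MODES = {
    "sdd-design": "subagent",
    "sdd-apply": "orchestrator + one subagent per task",
    "sdd-verify": "subagent",
    "sdd-discover": "parallel subagents",
}


def execution_mode(skill: str) -> str:
    return SUBAGENT_MODES.get(skill, "inline")


def analyze_domain(domain: str) -> str:
    """Hypothetical stand-in for one /sdd-discover subagent; returns a summary."""
    return f"{domain}: analyzed"


def discover(domains: list[str]) -> list[str]:
    """Fan out one worker per domain and collect summaries in input order."""
    if not domains:
        return []
    with ThreadPoolExecutor(max_workers=len(domains)) as pool:
        return list(pool.map(analyze_domain, domains))


print(execution_mode("sdd-spec"))
print(discover(["auth", "billing"]))
```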

Why this split matters

Context hygiene

The main conversation is finite (around 200K tokens effective). If /sdd-apply ran inline and read every file for every task, the context would fill fast and quality would degrade. By spawning one subagent per task, each task gets a fresh, focused context — the orchestrator only sees the returned summary, not all the file contents that went into producing it.
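A back-of-the-envelope calculation shows the scale of the difference. Only the 200K limit comes from the text above; the task count, per-task file-reading cost, and summary size are made-up illustrative figures.

```python
CONTEXT_LIMIT = 200_000        # approximate effective limit quoted above

# Illustrative assumptions, not measurements:
TASKS = 8
FILE_TOKENS_PER_TASK = 30_000  # tokens of file content read while implementing one task
SUMMARY_TOKENS = 500           # tokens in the summary a subagent returns


def orchestrator_cost(tasks: int, tokens_per_task: int) -> int:
    """Tokens added to the main conversation across all tasks."""
    return tasks * tokens_per_task


inline_total = orchestrator_cost(TASKS, FILE_TOKENS_PER_TASK)
delegated_total = orchestrator_cost(TASKS, SUMMARY_TOKENS)

print(f"inline:    {inline_total:,} tokens (over limit: {inline_total > CONTEXT_LIMIT})")
print(f"delegated: {delegated_total:,} tokens (over limit: {delegated_total > CONTEXT_LIMIT})")
```

Under these assumptions the inline run adds 240,000 tokens to the main conversation, while the delegated run adds 4,000, because the file contents stay inside each subagent's discarded context.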

Model selection

The model_hint in each skill tells orchestrators (sdd-agent, sdd-ff, sdd-continue, sdd-apply) which tier to spawn subagents on:
  • opus — judgment-heavy phases: propose, design
  • sonnet — comprehension-heavy phases: explore, spec, apply (per-task subagents), verify, audit, steer, init, new, ff, discover, agent
  • haiku — mechanical phases: tasks, archive, recall, docs, continue (dispatcher), apply (orchestrator)
The orchestrator itself may run on a different model than its subagents. /sdd-apply is a good example: the orchestrator runs on haiku (it just tracks task state and dispatches), while each per-task subagent runs on sonnet (it writes real code).
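The tier assignments above can be written down as a simple lookup. This is only a restatement of the list in code: the `apply:task` and `apply:orchestrator` keys are a hypothetical way to distinguish the two roles of /sdd-apply, and `tier_for` is not an SDD function.

```python
# Tier assignments copied from the list above.
# "apply:task" / "apply:orchestrator" are hypothetical keys for the two apply roles.
TIERS = {
    "opus": ["propose", "design"],
    "sonnet": ["explore", "spec", "apply:task", "verify", "audit", "steer",
               "init", "new", "ff", "discover", "agent"],
    "haiku": ["tasks", "archive", "recall", "docs", "continue", "apply:orchestrator"],
}

# Invert to role -> tier for direct lookup.
TIER_BY_ROLE = {role: tier for tier, roles in TIERS.items() for role in roles}


def tier_for(role: str) -> str:
    """Return the model tier for a phase or role; raises KeyError if unknown."""
    return TIER_BY_ROLE[role]


print(tier_for("apply:orchestrator"), tier_for("apply:task"))
```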

Prompt caching

Because subagents share a fixed prompt prefix — steering content loaded once by the orchestrator and passed inline — sequential subagents benefit from LLM prompt caching (5-minute TTL) across back-to-back task runs. See Token Optimization: Prompt Caching for details.
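The idea of a fixed prefix can be sketched as follows. `build_prompt` and `cache_is_warm` are hypothetical helpers, the steering text is a placeholder, and the elapsed-time check is only a simplified model of the 5-minute window mentioned above; actual cache behavior depends on the model provider.

```python
import os.path

CACHE_TTL_SECONDS = 5 * 60  # the 5-minute TTL mentioned above


def build_prompt(steering: str, task: str) -> str:
    """Put the shared steering content first so every prompt starts identically."""
    return f"{steering}\n\n## Task\n{task}"


def cache_is_warm(last_used: float, now: float, ttl: float = CACHE_TTL_SECONDS) -> bool:
    """Simplified model: the cached prefix is reusable if used within the TTL."""
    return now - last_used < ttl


steering = "# Steering\nProject conventions go here (placeholder)."
prompts = [build_prompt(steering, t) for t in ("Task 1: add route", "Task 2: add test")]

# os.path.commonprefix works character-wise on any list of strings.
shared = os.path.commonprefix(prompts)
print(shared.startswith(steering))
print(cache_is_warm(last_used=0.0, now=120.0), cache_is_warm(last_used=0.0, now=300.0))
```

Keeping the variable part (the task) at the end is what lets back-to-back subagents reuse the same cached prefix.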

Mental model

The orchestrator is always the one talking to you. Subagents are short-lived workers whose output is a report, not a conversation.

Practical implications

Three consequences of this architecture matter in day-to-day use:
  • Clearing context (/clear, new session) affects the orchestrator, not past subagents. Subagents don’t persist; they’re already gone by the time you see their summary. Clearing the orchestrator context is safe at any phase boundary.
  • Interactive questions (proposals, design decisions, task review) must happen in the orchestrator. A subagent cannot ask you anything mid-run: it either completes its task or reports a blocker back to the orchestrator, which then surfaces it to you.
  • /sdd-continue works from fresh sessions because it detects the current phase from artifacts on disk, not from conversation history. You can start a brand-new conversation at any point in the workflow and /sdd-continue will pick up exactly where you left off.
See Token Optimization: When to Clear Context for the recommended clearing schedule.
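Detecting the phase from artifacts on disk might look something like the sketch below. The artifact file names and the phase order are assumptions made for illustration; the real layout of an SDD change directory and the logic inside /sdd-continue may differ.

```python
import tempfile
from pathlib import Path

# Hypothetical artifact names and phase order; the real SDD layout may differ.
PHASE_ARTIFACTS = [
    ("propose", "proposal.md"),
    ("spec", "spec.md"),
    ("design", "design.md"),
    ("tasks", "tasks.md"),
]


def next_phase(change_dir: Path) -> str:
    """Return the first phase whose artifact is missing from the change directory."""
    for phase, filename in PHASE_ARTIFACTS:
        if not (change_dir / filename).exists():
            return phase
    return "apply"  # all planning artifacts exist


# Demo: a change with a proposal and a spec, but no design yet.
with tempfile.TemporaryDirectory() as tmp:
    change_dir = Path(tmp)
    (change_dir / "proposal.md").write_text("# Proposal\n")
    (change_dir / "spec.md").write_text("# Spec\n")
    print(next_phase(change_dir))
```

Because the function reads only the filesystem, it gives the same answer in a brand-new session as in a long-running one.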
