SDD includes several strategies to minimize token usage and cost across the workflow. These strategies work together — model selection reduces cost per token, while context management reduces the number of tokens processed in the first place. Neither alone is sufficient; the combination is what makes a long, multi-phase workflow economically viable.

Model selection

Each skill declares the minimum model tier needed in its model_hint frontmatter field. Orchestrators (sdd-agent, sdd-ff, sdd-continue) pass this hint when spawning subagents, so the right model is used for each job automatically.
| Hint | Use for | Skills |
| --- | --- | --- |
| opus | Judgment-heavy phases: design decisions, solution analysis | sdd-propose, sdd-design |
| sonnet | Code comprehension: analysis, spec writing, implementation | sdd-explore, sdd-spec, sdd-apply (subagents), sdd-verify, sdd-audit, sdd-steer, sdd-init, sdd-new, sdd-ff, sdd-discover, sdd-agent |
| haiku | Mechanical phases: template-filling, search, dispatch | sdd-tasks, sdd-archive, sdd-recall, sdd-docs, sdd-continue, sdd-apply (orchestrator) |

Using opus only for propose and design (the judgment phases) while running mechanical phases on haiku can reduce cost by 60–70% compared to running everything on a single high-tier model.
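
As a rough illustration of how an orchestrator might read the hint before spawning a subagent, here is a minimal sketch. The frontmatter layout, the sample skill text, and the `sonnet` fallback are assumptions made for the example; only the `model_hint` field name comes from the text above.

```python
import re

# Hypothetical skill file content; the exact frontmatter layout is an assumption.
SKILL_MD = """---
name: sdd-propose
model_hint: opus
---
# Propose
"""

DEFAULT_HINT = "sonnet"  # assumed fallback, not specified by SDD


def read_model_hint(skill_text: str, default: str = DEFAULT_HINT) -> str:
    """Return the model_hint value from a skill's frontmatter, or a default."""
    # Frontmatter is the block between the first two '---' lines.
    match = re.match(r"^---\n(.*?)\n---", skill_text, flags=re.DOTALL)
    if not match:
        return default
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "model_hint" and value.strip():
            return value.strip()
    return default


print(read_model_hint(SKILL_MD))
```

An orchestrator following this pattern would pass the returned tier along when it spawns the subagent, so the skill file stays the single place where the tier is declared.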

Context management

The artifact chain

Each SDD phase produces a file that captures all decisions made during that phase. Once the phase completes, the conversation context is redundant with the artifact — everything the AI discovered or decided is now on disk:
explore  → notes.md       (findings)
propose  → proposal.md    (scope decisions)
spec     → spec.md        (behavior)
design   → design.md      (architecture)
tasks    → tasks.md       (execution plan)
apply    → commits        (code)
verify   → PR             (result)
This chain is the basis for the context-clearing strategy.

When to clear context

| Moment | Clear? | Reason |
| --- | --- | --- |
| Between explore and propose | No | Coupled: exploration feeds proposal questions |
| After propose | Yes | proposal.md captures everything |
| After spec | Yes | spec.md captures everything |
| After design | Yes | design.md captures everything |
| After tasks | Yes (most important) | Apply is the longest phase; entering clean saves the most |
| During apply | No | Subagents already isolate context per task |
| After verify | Yes | PR created, everything captured |

Why this matters

If context is 50K tokens after the propose + spec phases and 15 apply turns remain, that is 50K × 15 = 750K tokens of input carrying stale context that the subagents don’t need. Clearing after tasks and re-reading the artifacts (~5K tokens total) eliminates that cost.
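
The arithmetic above can be checked with a short sketch. It assumes, for illustration only, that the whole context is re-sent as input on every remaining turn and that the re-read artifacts (~5K tokens) are carried the same way after clearing.

```python
def carried_input_tokens(context_tokens: int, remaining_turns: int) -> int:
    """Input tokens spent re-sending the same context on every remaining turn."""
    return context_tokens * remaining_turns


# Figures from the example above: 50K of stale context, 15 apply turns.
stale = carried_input_tokens(50_000, 15)

# Assumption: after clearing, only the ~5K of re-read artifacts is carried.
cleared = carried_input_tokens(5_000, 15)

saved = stale - cleared
print(stale, cleared, saved)
```

Under these assumptions the carried input drops from 750K to 75K tokens, a saving of 675K tokens.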

Why /sdd-continue makes this natural

/sdd-continue detects the current phase from artifacts on disk, not from conversation history. You can clear context, start a new session, run /sdd-continue, and the workflow resumes exactly where it left off. Clearing becomes a zero-friction operation rather than a disruptive reset.
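
A minimal sketch of artifact-based phase detection is shown below. It is not the actual /sdd-continue logic: the function name, the single-directory layout, and the rule "first missing artifact wins" are assumptions. Only the phase-to-file mapping is taken from the artifact chain.

```python
from pathlib import Path

# Artifact written by each phase, in workflow order (from the artifact chain).
PHASE_ARTIFACTS = [
    ("explore", "notes.md"),
    ("propose", "proposal.md"),
    ("spec", "spec.md"),
    ("design", "design.md"),
    ("tasks", "tasks.md"),
]


def next_phase(change_dir: Path) -> str:
    """Return the first phase whose artifact is missing, else 'apply'.

    Hypothetical helper: apply and verify produce commits and a PR rather
    than files, so they are not detected here.
    """
    for phase, filename in PHASE_ARTIFACTS:
        if not (change_dir / filename).exists():
            return phase
    return "apply"
```

Because the function looks only at the directory, it returns the same answer in a fresh session as in the one that wrote the files, which is the property that makes clearing context safe.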

Selective steering loading

Skills that read openspec/steering/ load only the specialist files relevant to the current task, not every .md file in the directory. Selection is based on the files the task touches:
  • Specialists with applies_to: all in their manifest → always loaded
  • conventions-testing.md → only when the task touches test files
  • conventions-security.md → only when the task touches auth, API, or input-handling files
  • Other specialists → only when the file matches the specialist’s declared domain
With five or more specialists installed, this reduces steering context from roughly 8KB to roughly 3KB per subagent — a meaningful reduction when multiplied across many tasks.
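
The selection rules listed above could be sketched as follows. The manifest structure, the glob patterns, and the file `conventions.md` are hypothetical; the real manifest format and matching rules may differ.

```python
from fnmatch import fnmatchcase

# Hypothetical manifest: specialist file -> patterns it applies to.
# "all" stands in for `applies_to: all`.
MANIFEST = {
    "conventions.md": ["all"],
    "conventions-testing.md": ["tests/*", "*_test.py", "*.spec.ts"],
    "conventions-security.md": ["*auth*", "api/*", "*input*"],
}


def select_steering(touched_files, manifest=MANIFEST):
    """Return the specialist files relevant to the files a task touches."""
    selected = []
    for specialist, patterns in manifest.items():
        always = "all" in patterns
        matches = any(
            fnmatchcase(path, pattern)
            for path in touched_files
            for pattern in patterns
        )
        if always or matches:
            selected.append(specialist)
    return selected


print(select_steering(["src/auth/login.py"]))
```

A task that touches only `src/utils/strings.py` would load just the always-on specialist in this sketch, which is where the reduction in steering context comes from.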

Prompt caching

Orchestrator skills (sdd-apply, sdd-agent) read steering files once and pass the content inline to subagent prompts. This creates a fixed prefix that is identical across every task in an apply run. LLM prompt caches (5-minute default TTL on Claude) hit on this prefix for every sequential subagent, so the steering content is billed at the full input rate only once; later subagents pay the much lower cache-read rate for it. The same strategy applies to sdd-discover, which uses an identical prompt prefix across all parallel domain-analysis subagents.
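
The key property is that the shared content comes first and is byte-identical across prompts, with the task-specific text last. The sketch below illustrates only that ordering; the prompt layout, function name, and sample strings are assumptions, and it makes no API calls, since actual cache behavior depends on the provider.

```python
import os


def build_subagent_prompt(steering: str, task: str) -> str:
    """Put the shared steering content first and the task-specific text last."""
    return f"## Steering\n{steering}\n\n## Task\n{task}\n"


# Read once by the orchestrator (sample content, purely illustrative).
steering = "Use 4-space indentation.\nWrite tests first."
tasks = ["T01: add login endpoint", "T02: add logout endpoint"]

prompts = [build_subagent_prompt(steering, task) for task in tasks]

# commonprefix compares the strings character by character.
shared_prefix = os.path.commonprefix(prompts)
print(len(shared_prefix), "characters shared across", len(prompts), "prompts")
```

Placing anything task-specific (a task ID, a timestamp) before the steering block would shorten the shared prefix and defeat the cache, which is why the variable part goes at the end.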

Output style

All skills include a terse output directive. Status reports use tables and single-line bullets instead of prose paragraphs. This reduces the response token count, which is also billed, without losing information.

English artifacts

All generated artifacts (proposal.md, spec.md, design.md, tasks.md, notes.md) are written in English regardless of the user’s preferred language. English uses approximately 30% fewer tokens than Romance languages (Spanish, French, Portuguese) for the same semantic content, so this is a consistent per-artifact saving across the entire workflow.