The systematic-debugging skill is rigid — follow it exactly. Its iron law: find the root cause before attempting any fix. Symptom fixes are failure.

When this skill fires

The skill description reads: “Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes.” This covers:
  • Test failures
  • Bugs in production
  • Unexpected behavior
  • Performance problems
  • Build failures
  • Integration issues
Use this skill especially when:
  • You’re under time pressure (emergencies make guessing tempting)
  • A “quick fix” seems obvious
  • You’ve already tried multiple fixes
  • You don’t fully understand the issue
Do not skip this skill when the issue seems simple. Simple bugs have root causes too. Do not skip it when you’re in a hurry — systematic debugging is faster than guess-and-check thrashing.

What it does

Systematic debugging walks through four phases in strict order: root cause investigation, pattern analysis, hypothesis testing, and implementation. You cannot move to Phase 2 without completing Phase 1. Each phase has a specific success criterion that must be met before proceeding.
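The strict ordering can be sketched as a simple gate: each phase must be completed in sequence, and trying to skip ahead fails loudly. This is an illustrative sketch only — the `DebugSession` class is hypothetical and not part of any real tool; only the phase names come from this skill.

```typescript
// Hypothetical sketch: enforce the skill's strict phase order.
const PHASES = [
  "root cause investigation",
  "pattern analysis",
  "hypothesis testing",
  "implementation",
] as const;

class DebugSession {
  private completed = 0; // number of phases finished so far

  // Completing a phase out of order throws: no Phase 2 without Phase 1.
  complete(phase: string): void {
    if (phase !== PHASES[this.completed]) {
      throw new Error(
        `Cannot complete "${phase}" before "${PHASES[this.completed]}"`,
      );
    }
    this.completed++;
  }

  get currentPhase(): string | undefined {
    return PHASES[this.completed];
  }
}
```

The point of the sketch is the thrown error: proposing a fix is only legal once `complete("root cause investigation")` has already happened.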

The iron law

NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
If you haven’t completed Phase 1, you cannot propose fixes.

How it works


Phase 1: Root cause investigation

Before attempting any fix, complete all of these:
  1. Read error messages carefully — don’t skip past errors or warnings. Read stack traces completely. Note line numbers, file paths, and error codes.
  2. Reproduce consistently — can you trigger it reliably? What are the exact steps? If not reproducible, gather more data — do not guess.
  3. Check recent changes — what changed that could cause this? Review git diff, recent commits, new dependencies, config changes, and environmental differences.
  4. Gather evidence in multi-component systems — when the system has multiple components (CI → build → signing, API → service → database), add diagnostic instrumentation at each component boundary before proposing fixes:
# Example: multi-layer CI/build system
echo "=== Secrets available in workflow: ==="
# Print SET or UNSET without echoing the secret's value
echo "IDENTITY: $([ -n "${IDENTITY:-}" ] && echo SET || echo UNSET)"
Run once to gather evidence showing WHERE it breaks, then analyze the output before proposing any fix.
  5. Trace data flow — when the error is deep in the call stack, start at the error line, follow the call stack upward, and note each function’s inputs and outputs until you find where expected behavior diverges from actual.
Phase 1 success criterion: you must have written this statement before proceeding:
“The bug is: [observable behavior]. Expected: [correct behavior]. First occurrence: [specific file:line or event]. Possible cause: [one hypothesis].”
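The written statement can be treated as a fill-in-the-blank record: if any field is empty, Phase 1 is not done. A minimal sketch — the `RootCauseStatement` type and `phase1Complete` helper are hypothetical, introduced only to make the criterion concrete:

```typescript
// Hypothetical structure mirroring the Phase 1 written statement.
interface RootCauseStatement {
  bug: string;             // observable behavior
  expected: string;        // correct behavior
  firstOccurrence: string; // specific file:line or event
  possibleCause: string;   // exactly one hypothesis
}

// Phase 1 is complete only when every field has been filled in.
function phase1Complete(s: Partial<RootCauseStatement>): boolean {
  return Boolean(s.bug && s.expected && s.firstOccurrence && s.possibleCause);
}
```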

Phase 2: Pattern analysis

Find the pattern before fixing:
  • Locate similar working code in the same codebase
  • If implementing a pattern, read the reference implementation completely — not a skim, every line
  • List every difference between the working and broken code, however small
  • Understand all dependencies: components, settings, config, environment
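The “list every difference, however small” step can be mechanized for short snippets. A minimal sketch — `listDifferences` is a hypothetical helper, and the working/broken lines shown are illustrative:

```typescript
// Hypothetical helper: list every line that differs between a
// known-working snippet and the broken one, however small.
function listDifferences(working: string[], broken: string[]): string[] {
  const diffs: string[] = [];
  const len = Math.max(working.length, broken.length);
  for (let i = 0; i < len; i++) {
    if (working[i] !== broken[i]) {
      diffs.push(
        `line ${i + 1}: working=${working[i] ?? "<none>"} | broken=${broken[i] ?? "<none>"}`,
      );
    }
  }
  return diffs;
}

// Illustrative comparison: a single missing keyword is exactly the kind
// of small difference this step is meant to surface.
const diffs = listDifferences(
  ["const ok = await compare(pw, hash);"],
  ["const ok = compare(pw, hash);"],
);
```

For real files, `git diff` or `diff` does the same job; the point is that every difference goes on the list before any fix is proposed.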

Phase 3: Hypothesis and testing

Apply the scientific method:
  • Form one hypothesis: “I think X is the root cause because Y” — write it down, be specific
  • Make the smallest possible change to test the hypothesis — one variable at a time
  • If it works: proceed to Phase 4
  • If it doesn’t: form a new hypothesis — do not add more fixes on top
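The Phase 3 decision loop, combined with the three-fix threshold from Phase 4, can be sketched as a tiny state function. The `Attempt` record and `nextStep` helper are hypothetical, introduced only to encode the rule “one hypothesis, one change, never stack fixes”:

```typescript
// Hypothetical record of one scientific-method iteration.
interface Attempt {
  hypothesis: string; // "I think X is the root cause because Y"
  change: string;     // the single smallest change that tests it
  worked: boolean;
}

// One written hypothesis per attempt; on failure, start fresh rather
// than stacking another fix. Three failures means stop entirely.
function nextStep(attempts: Attempt[]): string {
  const last = attempts[attempts.length - 1];
  if (last?.worked) return "proceed to Phase 4";
  if (attempts.length >= 3) return "stop and question the architecture";
  return "form a new hypothesis (do not stack fixes)";
}
```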

Phase 4: Implementation

Fix the root cause, not the symptom:
  1. Create a failing test case reproducing the bug (load the test-driven-development skill)
  2. Implement a single fix addressing the root cause — one change, no “while I’m here” improvements
  3. Verify: test passes, no other tests broken, issue resolved
If the fix doesn’t work:
  • Count how many fixes you have tried
  • If fewer than 3: return to Phase 1 with the new information
  • If 3 or more: stop and question the architecture (see below)

When 3+ fixes have failed

Three failed fixes indicate an architectural problem, not a hypothesis problem. Look for these patterns:
  • Each fix reveals new shared state, coupling, or a problem in a different place
  • Fixes require “massive refactoring” to implement
  • Each fix creates new symptoms elsewhere
Stop. Ask: Is this pattern fundamentally sound? Are we sticking with it through sheer inertia? Discuss with your human partner before attempting more fixes.

Red flags — stop and follow the process

If you catch yourself thinking any of these, return to Phase 1:
  • “Quick fix for now, investigate later”
  • “Just try changing X and see if it works”
  • “Add multiple changes, run tests”
  • “Skip the test, I’ll manually verify”
  • “It’s probably X, let me fix that”
  • “I don’t fully understand but this might work”
  • “Here are the main problems: [lists fixes without investigation]”
  • Proposing solutions before tracing data flow
  • “One more fix attempt” (when you’ve already tried 2+)

Common rationalizations

Excuse | Reality
“Issue is simple, don’t need process” | Simple issues have root causes too. The process is fast for simple bugs.
“Emergency, no time for process” | Systematic debugging is faster than guess-and-check thrashing.
“Just try this first, then investigate” | The first fix sets the pattern. Do it right from the start.
“Multiple fixes at once saves time” | You can’t isolate what worked, and it causes new bugs.
“I see the problem, let me fix it” | Seeing symptoms is not understanding root cause.

Real-world impact

From documented debugging sessions:
Approach | Time to fix | First-time fix rate | New bugs introduced
Systematic (this skill) | 15–30 minutes | 95% | Near zero
Random fixes | 2–3 hours of thrashing | 40% | Common

Example scenario

A test is failing: POST /api/login returns 500 on valid credentials.

Phase 1: Read the 500 error stack trace completely. Reproduce with curl. Check recent commits — a password hashing library was updated yesterday. Add logging at each layer: request handler, auth service, database query. Run once. The log shows data enters the auth service correctly, but the password comparison returns false for all users. Phase 1 complete. Written statement: “The bug is: password comparison always returns false for valid credentials. Expected: returns true. First occurrence: auth-service/validator.js:47. Possible cause: bcrypt API changed in the library update.”

Phase 2: Find other working bcrypt usage in the codebase. Read the bcrypt v6 changelog completely. Identify the difference: the compare function now returns a Promise instead of a callback value.

Phase 3: Hypothesis: the auth service is not awaiting the Promise from bcrypt.compare(). Test with the smallest possible change: add await.

Phase 4: Write a failing test reproducing the original bug. Apply the single fix. All tests pass.
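The scenario can be sketched in code. This is a simulation, not the real service: `compareAsync` is a hypothetical stand-in for a Promise-returning comparison (real hashing is omitted so the snippet is self-contained), and the broken validator treats the returned Promise as a boolean — exactly the class of bug described above.

```typescript
// Hypothetical stand-in for a comparison API that changed from a
// callback style to returning a Promise.
async function compareAsync(password: string, storedHash: string): Promise<boolean> {
  return password === storedHash; // stand-in; real bcrypt hashing omitted
}

// Broken validator: in untyped JS nothing flags this. The Promise
// itself, not its resolved value, is compared to `true`, so every
// login fails. (The cast mimics the absence of type checking.)
function isValidBroken(password: string, storedHash: string): boolean {
  const result = compareAsync(password, storedHash) as unknown as boolean;
  return result === true; // a Promise is never === true
}

// The single fix from Phase 3/4: await the Promise. One change,
// no "while I'm here" improvements.
async function isValidFixed(password: string, storedHash: string): Promise<boolean> {
  return (await compareAsync(password, storedHash)) === true;
}
```

A failing test written against the broken validator reproduces the bug (valid credentials rejected); the same test passes once the single `await` is added.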

Test-driven development

Required in Phase 4 for creating the failing test that reproduces the bug.

Verification before completion

Run after the fix to verify the bug is actually resolved before claiming success.
