When this skill fires
The skill description reads: “Use when implementing any feature or bugfix, before writing implementation code.” This means it fires for:
- New features
- Bug fixes
- Refactoring
- Behavior changes
What it does
TDD enforces a strict three-phase cycle (RED, GREEN, REFACTOR) for every piece of new functionality. The defining characteristic is that you must watch the test fail before writing any implementation. A test that passes immediately proves nothing.
How it works
The iron law
Red-green-refactor cycle
RED — Write the failing test
Write one minimal test showing what should happen. Requirements:
- Tests one behavior
- Has a clear name describing that behavior
- Uses real code (no mocks unless unavoidable)
Verify RED — Watch it fail (mandatory, never skip)
Run the test, then confirm:
- The test fails (not errors)
- The failure message is what you expected
- It fails because the feature is missing — not because of a typo
GREEN — Write minimal implementation
Write the simplest code that makes the test pass. Do not add features, refactor other code, or “improve” beyond what the test requires.
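As an illustration of "minimal", here is a sketch of an implementation that satisfies a single test rejecting an empty email. The function `submit_form`, its return shape, and the message "Email required" are assumptions made for this example, not part of the skill itself.

```python
def submit_form(data: dict) -> dict:
    """Hypothetical form handler after the GREEN step."""
    # Only the behavior the failing test demands: reject an empty email.
    if data["email"] == "":
        return {"ok": False, "error": "Email required"}
    return {"ok": True, "error": None}
```

Whitespace trimming, format validation, and handling of a missing `email` key are deliberately left out. Under this approach each of those would wait for its own failing test.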
Verify GREEN — Confirm all tests pass (mandatory)
Run the full test suite, then confirm:
- The new test passes
- All other tests still pass
- Output is clean (no errors, no warnings)
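One way to make the "clean output" check enforceable, sketched with Python's `unittest`: passing `warnings="error"` to `TextTestRunner` turns any warning raised during the run into a test error. The helper `suite_is_green` and the test classes are hypothetical names for this example.

```python
import io
import unittest
import warnings


def submit_form(data: dict) -> dict:
    """Hypothetical form handler from the GREEN step."""
    if data["email"] == "":
        return {"ok": False, "error": "Email required"}
    return {"ok": True, "error": None}


class SubmitFormTest(unittest.TestCase):
    def test_rejects_empty_email(self):
        self.assertEqual(submit_form({"email": ""})["error"], "Email required")

    def test_accepts_non_empty_email(self):
        self.assertTrue(submit_form({"email": "a@example.com"})["ok"])


class NoisyTest(unittest.TestCase):
    def test_passes_but_warns(self):
        # The assertion would hold, but the warning means the output is not clean.
        warnings.warn("legacy code path", DeprecationWarning)
        self.assertTrue(True)


def suite_is_green(*cases: type) -> bool:
    """Run every given TestCase class; warnings count as errors."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite(loader.loadTestsFromTestCase(c) for c in cases)
    # warnings="error" promotes any warning raised during the run to a test error.
    runner = unittest.TextTestRunner(stream=io.StringIO(), warnings="error")
    return runner.run(suite).wasSuccessful()


print(suite_is_green(SubmitFormTest))             # True
print(suite_is_green(SubmitFormTest, NoisyTest))  # False
```

The second call reports a non-green suite even though no assertion failed, which matches the checklist above: a passing run that emits warnings does not count as clean.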
Testing anti-patterns
Avoid these common mistakes:
- Testing mock behavior instead of real behavior: mocks should stand in for dependencies, not be the subject of the test
- Adding test-only methods to production classes: if you need a hook for testing, the design is coupled; use dependency injection instead
- Mocking without reading the real implementation: a wrong mock produces a wrong test
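The first two anti-patterns can be contrasted in a short sketch. It assumes Python's `unittest.mock`; `Mailer` and `Signup` are hypothetical classes invented for the example. The first test asserts on the mock itself, while the second injects the mock as a dependency and asserts on the behavior of the real class.

```python
import unittest
from unittest.mock import Mock


class Mailer:
    """Hypothetical dependency that would talk to a mail server."""

    def send(self, to: str, body: str) -> None:
        raise NotImplementedError("network access is not available in tests")


class Signup:
    """Hypothetical class under test; the mailer is injected, not created here."""

    def __init__(self, mailer):
        self.mailer = mailer

    def register(self, email: str) -> bool:
        if not email:
            return False
        self.mailer.send(email, "Welcome")
        return True


class SignupTest(unittest.TestCase):
    def test_anti_pattern_asserts_on_the_mock(self):
        mailer = Mock()
        mailer.send.return_value = "sent"
        # Proves only that Mock returns what it was told to return.
        self.assertEqual(mailer.send("a@example.com", "hi"), "sent")

    def test_empty_email_is_rejected_without_sending(self):
        mailer = Mock(spec=Mailer)  # stands in for the dependency only
        signup = Signup(mailer)     # the real Signup logic runs
        self.assertFalse(signup.register(""))
        mailer.send.assert_not_called()
```

Because `Signup` receives its mailer through the constructor, no test-only hook is needed on the production class. Using `spec=Mailer` also ties the mock to the real interface, so a call to a method that `Mailer` does not define raises an `AttributeError` instead of silently passing.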
Common rationalizations
| Excuse | Reality |
|---|---|
| “Too simple to test” | Simple code breaks. The test takes 30 seconds. |
| “I’ll write tests after” | Tests passing immediately prove nothing. |
| “Tests after achieve the same goals” | Tests-after = “what does this do?” Tests-first = “what should this do?” |
| “Already manually tested” | Ad-hoc is not systematic. No record, can’t re-run. |
| “Deleting X hours is wasteful” | Sunk cost fallacy. Unverified code is technical debt. |
| “TDD will slow me down” | TDD is faster than debugging. Pragmatic = test-first. |
Red flags — stop and start over
Any of these means: delete the code and start over with TDD.
- Code written before the test
- Test passes immediately
- Can’t explain why the test failed
- Tests added “later”
- “Keep as reference” or “adapt existing code”
- “Already spent X hours, deleting is wasteful”
- “TDD is dogmatic, I’m being pragmatic”
Example scenario: bug fix
Bug: Empty email is accepted by the form.
RED: write a failing test that submits an empty email and expects the form to reject it.
Verification checklist
Before marking work complete:
- Every new function/method has a test
- Watched each test fail before implementing
- Each test failed for the expected reason (feature missing, not a typo)
- Wrote minimal code to pass each test
- All tests pass
- Output is clean (no errors, no warnings)
- Tests use real code (mocks only if unavoidable)
- Edge cases and errors are covered
Related skills
Systematic debugging
When a bug is found, write a failing test reproducing it first, then follow TDD to fix it.
Verification before completion
Run before claiming the feature is complete — verifies the test suite is passing.
Writing plans
Plans include the exact TDD steps for each implementation task.