Confidence check

The confidence-check skill prevents the wasted effort of implementing on wrong assumptions. It runs a scored self-assessment across three dimensions and gates implementation on the result.

When this skill fires

The skill description reads: “Use before implementing a feature or making significant changes to verify you have enough context and understanding to proceed — prevents wasted effort from proceeding with wrong assumptions.” Specific triggers:

Before starting implementation of a feature
When requirements feel ambiguous
After reading existing code you’re about to modify
When you’re unsure about a technical approach

What it does

The skill scores three dimensions (Understanding, Context, and Approach), each on a 0–10 scale, for a total out of 30. The total score determines whether to proceed, fill gaps first, or stop and discuss. An anti-gaming protocol prevents inflated scores: any dimension scored 8 or higher requires specific named evidence.

How it works

The three dimensions

Understanding (0–10)

Do you understand what this is supposed to do?
Do you understand why it needs to exist?
Do you understand who uses it and how?

Context (0–10)

Do you know which files you need to touch?
Do you understand the existing code patterns you’re building on?
Do you know the data flow end-to-end?

Approach (0–10)

Do you know which approach you’ll take and why?
Have you considered at least one alternative?
Do you know how to test this?
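
One way to record the three scores is a small data structure that rejects out-of-range values and sums them to a total out of 30. The sketch below is illustrative only: the class name `Assessment` and its field names are assumptions, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """Hypothetical container for the three dimension scores (each 0-10)."""

    understanding: int
    context: int
    approach: int

    def __post_init__(self) -> None:
        # Reject anything outside the 0-10 scale used for each dimension.
        for name in ("understanding", "context", "approach"):
            value = getattr(self, name)
            if not 0 <= value <= 10:
                raise ValueError(f"{name} must be between 0 and 10, got {value}")

    @property
    def total(self) -> int:
        # Three dimensions at 10 points each give a maximum of 30.
        return self.understanding + self.context + self.approach
```

Making the record immutable (`frozen=True`) is a design choice in this sketch: a changed score means a new assessment rather than a silent edit to an old one.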

Anti-gaming protocol

For any dimension scored 8 or higher, you must write one specific piece of evidence naming a concrete artifact — a file, a line range, a function name, a specific behavior you observed. General statements do not count. If you cannot name specific evidence for a dimension scored 8+, the maximum score for that dimension is 6.

Valid evidence examples:

“Context: 9/10 — I have read auth.js:45-120 and understand the JWT validation flow, including how the middleware chain passes the decoded token to route handlers.”
“Understanding: 8/10 — The requirement says ‘users can reset their password via email.’ I know it means: generate a one-time token, email a reset link, validate the token on return, and expire it after 1 hour per the spec in JIRA-442.”

Invalid evidence (too vague):

“Context: 9/10 — I understand the codebase well.”
“Understanding: 8/10 — The requirement is clear.”
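
The cap rule can be expressed as a small function. This is a minimal sketch under stated assumptions: the name `apply_evidence_cap` is hypothetical, and an empty or whitespace-only string stands in for "no evidence". Judging whether the evidence is actually specific (a named file, line range, function, or observed behavior) is not something this check can do.

```python
EVIDENCE_THRESHOLD = 8  # scores at or above this require named evidence
CAPPED_SCORE = 6        # maximum allowed when that evidence is missing


def apply_evidence_cap(score: int, evidence: str = "") -> int:
    """Return the score after applying the anti-gaming cap.

    Only checks that some evidence was written down; it cannot judge
    whether that evidence names a concrete artifact.
    """
    if score >= EVIDENCE_THRESHOLD and not evidence.strip():
        return CAPPED_SCORE
    return score
```

For instance, `apply_evidence_cap(9)` yields 6, while `apply_evidence_cap(9, "I have read auth.js:45-120")` keeps the 9.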

Scoring thresholds

Score	Action
27–30	Proceed with implementation
20–26	Fill gaps before starting — identify and resolve unclear items
Below 20	Stop — load brainstorming skill or discuss with the user before any code
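
The thresholds translate directly into a lookup on the total. The following is a sketch rather than a reference implementation; the function name `action_for` and the short action labels it returns are assumptions chosen for illustration.

```python
def action_for(total: int) -> str:
    """Map a total score (0-30) to the action from the thresholds table."""
    if not 0 <= total <= 30:
        raise ValueError(f"total must be between 0 and 30, got {total}")
    if total >= 27:
        return "proceed"    # 27-30: proceed with implementation
    if total >= 20:
        return "fill_gaps"  # 20-26: resolve unclear items first
    return "stop"           # below 20: brainstorm or discuss before any code
```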

External validation at 20–26

When total score is in the 20–26 range, state your understanding out loud in one paragraph — what you are about to build, why, and how. Be specific. If the user does not correct it within their next response, proceed. If they do correct it, re-run the check from the beginning with the updated understanding.

Context change handling

If your confidence drops below 20 mid-implementation, stop immediately. Do not continue writing code. Announce: “Confidence has dropped to [X]/30 due to [specific reason]. Pausing implementation to [specific gap-filling action].” Then execute the gap-filling action. Do not resume until confidence returns to 20+ (with external validation) or to 27+ (for autonomous continuation).
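
The pause announcement and the resume rule can be sketched as two small helpers. The function names and the `validated` flag are hypothetical; `validated` is assumed to mean that the user saw the restated understanding and did not correct it.

```python
def pause_announcement(total: int, reason: str, gap_action: str) -> str:
    """Fill in the announcement template with the current total and gap."""
    return (
        f"Confidence has dropped to {total}/30 due to {reason}. "
        f"Pausing implementation to {gap_action}."
    )


def may_resume(total: int, validated: bool) -> bool:
    """Return True when implementation may resume after a pause.

    27+ allows autonomous continuation; 20-26 additionally requires
    external validation from the user.
    """
    if total >= 27:
        return True
    return total >= 20 and validated
```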

What to do with low scores

Low on Understanding: Restate the requirement in your own words and ask the user to confirm. Do not assume you understood.
Low on Context: Read the relevant files before proceeding. Do not write code that touches files you have not read. Use the deep-research skill for external unknowns.
Low on Approach: Load the architecture-design skill. Explore 2–3 options before committing to one.

Example scenario

You’re about to implement a password reset endpoint.

Understanding: 8/10. Evidence: “I know the endpoint needs to: accept an email address, generate a time-limited reset token, send a reset link, validate the token on return, and allow the user to set a new password. Gap: I don’t know whether the token should be single-use.”
Context: 6/10. Evidence: “I have read user.model.js and understand the User schema. I have not yet read the email service. I can see it referenced in order.controller.js but don’t know its interface.”
Approach: 7/10. Evidence: “I will use crypto.randomBytes(32) for token generation, store the SHA-256 hash in a password_reset_tokens table with an expires_at column. Gap: email dispatch step is uncertain.”

Total: 21/30. Fill gaps before starting. Gaps:

Read the email service implementation
Confirm with user: single-use tokens?

State understanding and wait for confirmation before proceeding.
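
The Approach evidence in this scenario names a concrete token scheme. A minimal Python sketch of that scheme follows, for illustration only. The scenario's `crypto.randomBytes(32)` is a Node.js call, so Python's `secrets.token_bytes(32)` is swapped in here. The function name `new_reset_token`, the shape of the returned tuple, and the one-hour default lifetime are assumptions.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone


def new_reset_token(lifetime: timedelta = timedelta(hours=1)) -> tuple[str, str, datetime]:
    """Return (raw_token, token_hash, expires_at) for a password reset.

    The raw token would go into the emailed link; only the SHA-256 hash
    and the expiry would be stored (for example in password_reset_tokens).
    """
    raw_token = secrets.token_bytes(32).hex()  # 32 random bytes, hex-encoded
    token_hash = hashlib.sha256(raw_token.encode("ascii")).hexdigest()
    expires_at = datetime.now(timezone.utc) + lifetime
    return raw_token, token_hash, expires_at
```

Storing only the hash means a leaked table does not expose usable reset links. Whether a token is invalidated after first use is the open question the scenario asks the user to confirm, so it is left out of this sketch.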

Related skills

Deep research

Fills the Context gap when external unknowns (third-party APIs, library behavior) are the blocker.

Architecture design

Fills the Approach gap when the technical path is unclear.

Brainstorming

Required when total score falls below 20 — indicates the feature needs design before implementation.
