The Andrej Karpathy Skills project captures a key insight from Andrej Karpathy himself: LLMs are exceptionally good at looping until they meet specific goals. The failure mode isn’t execution; it’s goal definition. When a task is stated as a vague imperative (“fix the auth system,” “make it work”), the model has no way to know when it’s done or whether its changes solved the right problem. This principle, Goal-Driven Execution, transforms imperative instructions into declarative goals with verification steps, letting the model drive toward a measurable outcome rather than guess at one.
The key insight
From Andrej Karpathy:
“LLMs are exceptionally good at looping until they meet specific goals… Don’t tell it what to do, give it success criteria and watch it go.”
The implication is practical: weak success criteria require constant clarification; strong success criteria let you loop independently. “Make it work” is a weak criterion. “Write a test that reproduces the bug, then make the test pass” is a strong one. The second formulation gives the model a loop condition it can evaluate on its own.
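A loop condition of this kind can be sketched in a few lines of Python. This is only an illustration, not part of the project: `loop_until`, `attempt`, and `is_done` are hypothetical names, and the toy counter stands in for "edit the code until the test passes."

```python
def loop_until(attempt, is_done, max_iterations=10):
    """Call attempt() until is_done() returns True.

    Returns the number of attempts made. Raises RuntimeError if the goal
    is still unmet after max_iterations attempts.
    """
    attempts = 0
    while not is_done():
        if attempts >= max_iterations:
            raise RuntimeError("goal not met within the iteration budget")
        attempt()
        attempts += 1
    return attempts


# Toy task (hypothetical): the "work" is incrementing a counter, and the
# success criterion is a check that can be evaluated without asking anyone.
state = {"value": 0}


def increment():
    state["value"] += 1


def reached_target():
    return state["value"] >= 3


attempts_needed = loop_until(increment, reached_target)
```

A vague goal has no `is_done` to evaluate, so the loop can only stop when someone steps in to clarify.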
The transformation pattern
Transform imperative tasks into verifiable goals before starting:

| Instead of… | Transform to… |
|---|---|
| “Add validation” | “Write tests for invalid inputs, then make them pass” |
| “Fix the bug” | “Write a test that reproduces it, then make it pass” |
| “Refactor X” | “Ensure tests pass before and after” |
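As a hedged example of the first row, the Python sketch below lists the invalid inputs before the validator is considered done. The `parse_age` function, its 0 to 150 range, and the sample inputs are all assumptions made up for illustration.

```python
def parse_age(raw):
    """Hypothetical validator: accept whole numbers from 0 to 150."""
    text = raw.strip()
    if not text.isdecimal():
        raise ValueError(f"age must be a whole number, got {raw!r}")
    age = int(text)
    if age > 150:
        raise ValueError(f"age out of range: {age}")
    return age


# Written first: these inputs define what "validation added" means.
INVALID_INPUTS = ["", "abc", "-5", "12.5", "999"]


def test_rejects_invalid_inputs():
    for raw in INVALID_INPUTS:
        try:
            parse_age(raw)
        except ValueError:
            continue
        raise AssertionError(f"{raw!r} should have been rejected")


def test_accepts_valid_input():
    assert parse_age(" 42 ") == 42


test_rejects_invalid_inputs()
test_accepts_valid_input()
```

The goal is met when both tests pass, so no one has to judge whether the validation is "good enough."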
Multi-step plan format
For tasks with more than one phase, state a brief plan before starting. Each step should name an action and a verification:
Multi-step plan template
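One way to make each step's verification explicit is to pair the action with an executable check. The Python sketch below is an illustration under assumptions, not the project's own template: `PlanStep` and `run_plan` are hypothetical names, and the lambdas stand in for real checks such as running a test suite.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PlanStep:
    action: str                 # what will be done
    check: str                  # how success is verified, in words
    verify: Callable[[], bool]  # executable form of the check


def run_plan(steps):
    """Verify steps in order and stop at the first failing check."""
    log = []
    for number, step in enumerate(steps, start=1):
        passed = step.verify()
        status = "ok" if passed else "FAILED"
        log.append(f"{number}. {step.action} -> verify: {step.check} [{status}]")
        if not passed:
            break
    return log


# Placeholder checks: the second one fails, so the third step never runs.
plan = [
    PlanStep("Write a failing test", "test fails for the expected reason", lambda: True),
    PlanStep("Apply the fix", "new test passes", lambda: False),
    PlanStep("Run the full suite", "no regressions", lambda: True),
]
log = run_plan(plan)
```

Stopping at the first failed check keeps later steps from building on an unverified phase.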
Examples from real requests
Vague vs. verifiable: “Fix the authentication system”
A vague approach produces a list of activities with no measurable outcome. A verifiable approach defines the specific issue first, then works toward a concrete test condition.
Bad example: vague plan, no success criteria
Good example: verifiable goals with explicit checks
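To make the contrast concrete, here is a minimal Python sketch of what a verifiable goal could look like. The specific issue (sessions staying valid after a password change) is an assumption chosen for illustration, and `AuthService` with all of its methods is a hypothetical toy model, not a real library.

```python
class AuthService:
    """Toy in-memory auth model (hypothetical). Real code must hash passwords."""

    def __init__(self):
        self._passwords = {}  # username -> password
        self._sessions = {}   # token -> username
        self._next_token = 0

    def register(self, user, password):
        self._passwords[user] = password

    def login(self, user, password):
        if self._passwords.get(user) != password:
            raise PermissionError("invalid credentials")
        self._next_token += 1
        token = f"session-{self._next_token}"
        self._sessions[token] = user
        return token

    def is_valid(self, token):
        return token in self._sessions

    def change_password(self, user, new_password):
        self._passwords[user] = new_password
        # Behaviour under test: drop every session that belongs to this user.
        self._sessions = {t: u for t, u in self._sessions.items() if u != user}


def test_password_change_invalidates_sessions():
    auth = AuthService()
    auth.register("ana", "old-secret")
    auth.register("bo", "bo-secret")
    ana_token = auth.login("ana", "old-secret")
    bo_token = auth.login("bo", "bo-secret")

    auth.change_password("ana", "new-secret")

    assert not auth.is_valid(ana_token)  # old session is gone
    assert auth.is_valid(bo_token)       # other users are unaffected
    assert auth.is_valid(auth.login("ana", "new-secret"))


test_password_change_invalidates_sessions()
```

"Fix the authentication system" has no test like this one; "sessions are invalid after a password change" does, and the test passing is the stopping condition.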
Test-first verification: “The sorting breaks when there are duplicate scores”
Jumping straight to a fix without reproducing the bug risks solving the wrong problem. The test-first approach confirms the bug exists, fixes it, and confirms it’s gone.
Write a test that reproduces the issue
Before changing any production code, write a test that fails in the way the user described. This confirms you understand the bug.
Fix the production code
Now that the bug is confirmed and reproducible, apply the fix. The test tells you exactly what “done” looks like.
Compare this to the alternative: immediately change the sort logic without reproducing the bug first. That approach might fix the issue, but it might also mask it or introduce a different one. The test-first path removes the ambiguity.
What success looks like
This principle is working when:
- Tasks have explicit verification steps before implementation begins, not after.
- Multi-step work is planned upfront with a check at each phase boundary.
- Bugs are reproduced before they’re fixed, so you can confirm the fix worked.
- Regressions are caught automatically because existing tests run as part of the plan.
Related principles
Think Before Coding
Agree on what the task is before defining success criteria for it.
Surgical Changes
Once you know the goal, change only the lines that achieve it.