LLM Council is not a single model giving a single answer. It is a deliberation system: multiple AI models respond independently, then anonymously evaluate one another, and finally a designated Chairman synthesizes everything into one authoritative reply. The entire flow is asynchronous and parallel wherever possible, so the wall-clock time is only slightly longer than a single slow model call.
Full Data Flow
The diagram below traces exactly how a query travels through the system, from raw user input to the structured JSON payload the frontend receives.
Why Each Stage Matters
Stage 1 captures uninfluenced first opinions. Every council model receives only the user’s raw question, with no knowledge of what any other model will say. This is the raw material of the deliberation.
Stage 2 adds accountability without bias. Models must defend their rankings in writing, but they never know whose response they are grading. This prevents any single model from rubber-stamping a peer it knows to be from a “prestigious” provider. The written evaluation also surfaces why a response was ranked highly, not just where it placed.
Stage 3 converts a leaderboard into a usable answer. The Chairman reads every first-opinion response, every peer evaluation, and every ranking, then writes a single synthesized reply that draws on the collective wisdom of the council. No individual model’s blind spots dominate the final output.
Anonymization Strategy
Anonymization is the heart of Stage 2’s fairness guarantee. Before any ranking prompts are sent, the backend assigns a letter label to each Stage 1 response:
- Response A → first model’s output
- Response B → second model’s output
- Response C → third model’s output
- … and so on up to Response Z, for up to 26 council members

The backend records this mapping in a label_to_model dictionary, for example {"Response A": "openai/gpt-5.1", "Response B": "anthropic/claude-sonnet-4.5"}, and stores it in the API response’s metadata field. Evaluating models receive only the letter labels; they never see provider names or model identifiers.
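The labelling step can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name `anonymize_responses`, the `model`/`response` keys on each Stage 1 result, and the way the anonymized text is joined are all assumptions made for the example.

```python
import string


def anonymize_responses(stage1_results):
    """Label Stage 1 results 'Response A', 'Response B', ... in order.

    Assumption: each result is a dict with 'model' and 'response' keys.
    Returns the label -> model mapping and the text shown to evaluators.
    """
    letters = string.ascii_uppercase
    if len(stage1_results) > len(letters):
        raise ValueError("letter labels only cover 26 council members")

    labels = [f"Response {letter}" for letter in letters[: len(stage1_results)]]

    # The mapping stays on the backend side; evaluators never see it.
    label_to_model = {
        label: result["model"] for label, result in zip(labels, stage1_results)
    }

    # Only labels and response bodies go into the ranking prompt.
    anonymized_text = "\n\n".join(
        f"{label}:\n{result['response']}"
        for label, result in zip(labels, stage1_results)
    )
    return label_to_model, anonymized_text
```

Keeping the mapping separate from the prompt text is what lets the ranking prompt stay free of provider names while the response metadata can still be used to restore them later.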
De-anonymization happens entirely on the client side. The frontend’s Stage2.jsx component reads the labelToModel prop (populated from metadata.label_to_model) and runs a string-replace pass over each evaluation’s raw text before rendering. The bold model names a user sees in the UI are a display convenience — the underlying evaluation text was written about anonymous letters.
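The real replacement pass lives in the frontend's Stage2.jsx; the sketch below shows the same idea in Python purely for illustration. The function name `deanonymize` and the Markdown-style `**...**` bold markers are assumptions, not taken from the component.

```python
def deanonymize(text, label_to_model):
    """Swap anonymous labels for bolded model names in evaluation text.

    Assumption: labels appear verbatim in the text (e.g. 'Response A').
    """
    # Replace longer labels first so no label can clobber part of another.
    for label in sorted(label_to_model, key=len, reverse=True):
        text = text.replace(label, f"**{label_to_model[label]}**")
    return text
```

For example, with the mapping shown earlier, `deanonymize("Response A is stronger than Response B.", label_to_model)` yields text that names both models in place of the letters, while the stored evaluation itself is left untouched.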
Graceful Degradation
LLM Council is designed never to fail the whole request because one provider had a bad moment. query_models_parallel in openrouter.py uses asyncio.gather() so all model calls run concurrently. Each individual call has a 120-second timeout enforced by the httpx.AsyncClient. If a model returns None (network error, rate limit, or timeout), that entry is filtered out before Stage 1 results are assembled, and the pipeline continues with however many responses did succeed.
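The filter-on-None pattern can be sketched as follows. This is a simulation under stated assumptions, not the code in openrouter.py: `call_model` stands in for a real HTTP call by sleeping, `asyncio.wait_for` stands in for the httpx timeout, and `collect_stage1` and its argument shapes are hypothetical.

```python
import asyncio


async def call_model(model, fake_latency, timeout=120.0):
    """Simulated provider call: returns text, or None if it times out."""
    try:
        # Stand-in for the real request; the sleep plays the network delay.
        await asyncio.wait_for(asyncio.sleep(fake_latency), timeout=timeout)
    except asyncio.TimeoutError:
        return None
    return f"answer from {model}"


async def collect_stage1(latencies, timeout=120.0):
    """Query every model concurrently and keep only the successes."""
    models = list(latencies)
    # gather() preserves argument order, so results line up with `models`.
    responses = await asyncio.gather(
        *(call_model(m, latencies[m], timeout) for m in models)
    )
    return [
        {"model": m, "response": r}
        for m, r in zip(models, responses)
        if r is not None  # failed or timed-out models are dropped
    ]
```

Because failures are turned into None inside each call rather than raised, one slow or broken provider cannot cancel the other calls in the gather.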
The only hard failure condition is when all council models fail to respond. In that case run_full_council returns an explicit error dict so the frontend can surface a clear message rather than rendering empty stages.
The Orchestrator: run_full_council
All three stages are wired together by run_full_council in council.py, which is the single entry point called by the API layer for every user message.
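The overall control flow might look roughly like the sketch below. It is an outline under assumptions rather than the contents of council.py: the stage functions are passed in as parameters because their real names and signatures are not given here, and the return shape (a dict with `stage1`, `stage2`, `stage3`, and `metadata` keys) and the error message wording are hypothetical.

```python
import asyncio


async def run_full_council_sketch(user_query, stage1, stage2, stage3, aggregate):
    """Run the three stages in order; each argument is a stage callable.

    Assumed signatures (illustrative only):
      stage1(query) -> list of results
      stage2(query, stage1_results) -> (rankings, label_to_model)
      stage3(query, stage1_results, stage2_results) -> final answer dict
      aggregate(stage2_results, label_to_model) -> leaderboard list
    """
    stage1_results = await stage1(user_query)

    # Hard failure: every council model failed, so stop with an error payload.
    if not stage1_results:
        return {
            "stage1": [],
            "stage2": [],
            "stage3": {"model": "error", "response": "All models failed to respond."},
            "metadata": {},
        }

    stage2_results, label_to_model = await stage2(user_query, stage1_results)
    aggregate_rankings = aggregate(stage2_results, label_to_model)
    stage3_result = await stage3(user_query, stage1_results, stage2_results)

    return {
        "stage1": stage1_results,
        "stage2": stage2_results,
        "stage3": stage3_result,
        "metadata": {
            "label_to_model": label_to_model,
            "aggregate_rankings": aggregate_rankings,
        },
    }
```

The stages run strictly in sequence because each one consumes the previous stage's output; the parallelism lives inside Stage 1 and Stage 2, not between stages.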
Metadata (the label_to_model mapping and the aggregate_rankings list) is ephemeral. It is returned in the API response body but is not written to the JSON conversation store on disk. If you reload a past conversation, the stage text is available but the metadata must be re-derived from context.
Explore Each Stage
Stage 1 — Parallel First Opinions
How all council models are queried simultaneously and how failures are filtered before ranking begins.
Stage 2 — Anonymous Peer Review
How responses are anonymized, how ranking prompts are structured, and how votes are aggregated into a leaderboard.
Stage 3 — Chairman Synthesis
How the Chairman model reads all prior context and produces the single final answer shown to the user.
Configuration
How to change which models sit on the council and which model acts as Chairman.