Every large language model has its own prompt personality. Claude thrives on rich XML structure; o3 breaks when you add chain-of-thought scaffolding; Gemini drifts into hallucinated citations without a grounding anchor; Ollama’s behavior is entirely determined by the model underneath it. Prompt Master routes your request to the correct framework for each target automatically — this page documents every rule it applies so you understand exactly why your generated prompt looks the way it does.Documentation Index
Fetch the complete documentation index at: https://mintlify.com/nidhinjs/prompt-master/llms.txt
Use this file to discover all available pages before exploring further.
The single most impactful split in LLM prompting is between standard instruction-following models (Claude, GPT, Gemini, Qwen, Llama) and reasoning-native models (o3/o4-mini, DeepSeek-R1, Qwen3 thinking mode). Standard models benefit from explicit chain-of-thought guidance. Reasoning-native models have built-in adaptive thinking — adding CoT instructions actively degrades their output quality. Prompt Master detects the target model and strips CoT automatically.
Claude 4.x (Opus 4.8 default / 4.7)
Claude 4.x is the default model when no specific LLM is named. Opus 4.8 is the default variant. Claude responds best to precise, front-loaded instructions delivered in a literal tone — it follows what you write, not what you imply.- Front-load everything — put the most important instructions at the top of the prompt, not buried in context
- Use XML tags for complex prompts — wrap sections in
<instructions>,<context>,<examples>,<output>for reliable parsing - Adaptive thinking is built-in — never add explicit chain-of-thought instructions (CoT); Claude 4.x handles reasoning internally
- 1M context window — safe for large codebases and long documents; no truncation anxiety
- Literal instruction following — specify precisely what you want; Claude will not infer unstated preferences
- Append for agentic tasks:
Only make changes directly requested.— prevents Claude from over-engineering solutions - Template: Template M (agentic tasks), CO-STAR (structured requests), RTF (quick tasks)
ChatGPT / GPT-5.x
GPT-5.x is highly capable at long-context synthesis and following output contracts. The guiding principle is smallest prompt that works — GPT responds well to concise, contract-first instructions and tends to verbosity when given room.- Smallest prompt that works — strip every word that doesn’t carry instruction weight
- Explicit output contract — state the exact format, length, and structure you expect up front
- Compact structured outputs — prefer JSON, numbered lists, and tables over free prose responses
- Constrain verbosity with word counts —
Respond in under 150 wordsorReturn exactly 3 bullet pointsprevents runaway outputs - Strong at long-context synthesis — safe to provide large documents for summarization, analysis, or extraction tasks
o3 / o4-mini
o3 and o4-mini are reasoning-native models with internal chain-of-thought. Longer, more elaborate prompts do not improve their output — they degrade it. These models need space to think, not scaffolding.- SHORT, clean instructions only — keep the full prompt concise; clarity beats completeness here
- System prompts under 200 words — longer system prompts reduce reasoning quality
- Zero-shot first — do not add few-shot examples unless the task genuinely requires them; examples constrain reasoning
- No CoT scaffolding — never add step-by-step reasoning instructions (see warning above)
- State the goal, not the method — describe what you want, not how to get there
Gemini 2.x / 3
Gemini 2.x and 3 excel at long-context tasks and multimodal inputs (text, images, documents). The main failure mode is hallucinated citations — Gemini will confidently cite sources that don’t exist without a grounding constraint.- Strong long-context and multimodal capability — safe for document-heavy, image-paired, and mixed-media tasks
- Add a grounding anchor — explicitly instruct
Only cite sources provided in this promptorDo not fabricate citationsto prevent hallucinated references - Explicit format locks — state the output structure precisely; Gemini drifts into verbose prose without a hard format constraint
- Avoid open-ended “discuss” prompts — prefer specific deliverables (
return a table,write exactly 5 points)
Qwen 2.5
Qwen 2.5 has excellent instruction-following capability and handles JSON output reliably. It works best with a well-defined system prompt and focused task scope — spreading the request across a long prompt reduces accuracy.- Clear, explicit system prompt — Qwen 2.5 uses the system prompt effectively; invest in it
- Shorter, focused prompts — concentrated task descriptions outperform sprawling multi-part instructions
- Strong at JSON and structured output — the safest choice for structured extraction tasks when using open-weight models
- Explicit instruction following — state each requirement as a separate line item rather than embedding in prose
Qwen3 (Thinking vs. Non-Thinking Mode)
Qwen3 operates in two distinct modes that require completely different prompting strategies.- Thinking Mode
- Non-Thinking Mode
Qwen3 thinking mode is reasoning-native — treat it exactly like o3.
- No CoT instructions — built-in adaptive reasoning; adding step-by-step guidance degrades output
- Short, clean instructions — same rules as o3/o4-mini apply
- Zero-shot preferred — avoid few-shot examples unless strictly required
- State the goal only — do not prescribe a reasoning method
Ollama
Ollama is a runtime, not a model — the model running underneath it determines the actual prompting strategy. Always establish which model is loaded before writing a prompt.- Always identify the underlying model first —
mistral,llama3,gemma2,qwen2.5,deepseek-r1all have different optimal strategies - System prompt is the most impactful lever — Ollama models respond strongly to a well-crafted system prompt; invest here first
- Temperature 0.1 for coding tasks — deterministic output for code generation, debugging, and data extraction
- Temperature 0.7–0.8 for creative tasks — writing, brainstorming, ideation, dialogue
Llama / Mistral
Open-weight models like Llama 3.x and Mistral respond best to simple, direct language. Claude-style XML scaffolding and GPT-style elaborate output contracts add noise without benefit here.- Shorter, simpler prompts — concise beats comprehensive; long prompts degrade instruction-following
- Simple flat structure — avoid nested XML, elaborate JSON schemas, or multi-section templates
- More explicit than Claude or GPT — don’t rely on inference; state each requirement directly
- Avoid implicit assumptions — what Claude infers from context, Llama needs stated explicitly
DeepSeek-R1
DeepSeek-R1 is a reasoning-native model that produces its internal reasoning chain inside<think> tags before delivering the final answer. Like o3, it has built-in adaptive thinking that is degraded by explicit CoT scaffolding.
- Reasoning-native — no CoT instructions — same rule as o3; short, clean goal statements only
- Short, clean instructions — let the model’s internal reasoning do the work
- Outputs appear in
<think>tags — final answer follows the closing tag; parse accordingly if using the API - Strong at math, logic, and complex reasoning — well-suited for multi-step analytical tasks
MiniMax M3 / M2.7
MiniMax models are OpenAI-API-compatible and support very long contexts. M2.7 has a 1M token context window. There is a strict temperature constraint that applies across both models.- OpenAI-compatible API — works with the same message format as GPT; system/user/assistant roles apply
- 1M context window on M2.7 — safe for very large document analysis and codebase tasks
- Temperature must be between 0 and 1 — values outside this range will error; do not use 0 as a floor if the API requires a positive value
- Strong at code, JSON, and multi-step workflows — reliable structured output and sequential task handling
- Treat prompting style like GPT-5.x — explicit output contract, constrained verbosity, compact structure
At a Glance: CoT Rules by Model
| Model | CoT Instructions | Why |
|---|---|---|
| Claude 4.x | ❌ Never | Adaptive thinking built-in |
| GPT-5.x | ✅ Optional | Improves complex multi-step tasks |
| o3 / o4-mini | ❌ Never | Degrades reasoning output |
| Gemini 2.x/3 | ✅ Optional | Helpful for structured analysis |
| Qwen 2.5 | ✅ Optional | Improves accuracy on complex tasks |
| Qwen3 Thinking | ❌ Never | Reasoning-native mode |
| Qwen3 Non-Thinking | ✅ Optional | Standard model behavior |
| Llama / Mistral | ✅ Optional | Use sparingly; keep prompts short |
| DeepSeek-R1 | ❌ Never | Reasoning-native; uses <think> tags |
| MiniMax M3/M2.7 | ✅ Optional | Treat like GPT |