Overview
PDD commands can run either in cloud mode or local mode. Cloud mode is the default and requires no API key management. Local mode uses the --local flag and requires you to provide API keys for one or more LLM providers.
Cloud mode (default)
GitHub SSO authentication. No API keys needed. Access to powerful models with automatic selection, cost optimization, and shared grounding from the PDD community.
Local mode (--local)
API keys required per provider. Uses LiteLLM to route to OpenAI, Anthropic, Google, or any supported provider. Full control over model selection.
Cloud mode
Cloud mode is the default for all PDD commands. It provides:
- No need to manage API keys locally
- Access to more powerful models
- Shared examples and improvements across the PDD community
- Automatic updates and improvements
- Better cost optimization
Authentication flow
Cloud mode uses GitHub Single Sign-On (SSO) for authentication.
Trigger authentication
On first use of a cloud command, PDD automatically opens your default browser to the GitHub login page.
Log in with GitHub
Sign in with your GitHub account and authorize PDD Cloud to access your GitHub profile.
Automated grounding
In cloud mode, PDD uses Automated Grounding to prevent implementation drift. When you run pdd generate, the system:
- Embeds your prompt into a vector
- Searches for similar prompts in the cloud database (cosine similarity)
- Auto-injects the closest (prompt, code) pair as a few-shot example
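The retrieval step above can be sketched with toy vectors. This is a simplified illustration only; PDD's actual embedding model, vector dimensions, and database are internal to the cloud service, and all data below is hypothetical:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest_example(query_vec, database):
    """Return the stored (prompt, code) pair whose embedding is closest
    to the query embedding; this pair is injected as a few-shot example."""
    return max(database, key=lambda row: cosine_similarity(query_vec, row["embedding"]))

# Toy database of previously grounded (prompt, code) pairs (hypothetical data)
db = [
    {"embedding": [0.9, 0.1, 0.0], "prompt": "sum a list", "code": "sum(xs)"},
    {"embedding": [0.1, 0.9, 0.2], "prompt": "read a file", "code": "open(p).read()"},
]

best = nearest_example([0.8, 0.2, 0.1], db)
print(best["prompt"])  # the closest stored prompt: "sum a list"
```

The injected pair then rides along with your prompt as a worked example, steering generation toward previously validated output.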
Local mode
Use the --local flag to run commands without cloud connectivity. Local mode uses LiteLLM to route requests to supported providers.
API key setup
Set environment variables for the providers you want to use. Add them to your .bashrc, .zshrc, or equivalent shell profile for persistence.
When keys are missing, PDD prompts for them interactively and securely stores them in your local .env file.
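For example, keys can be exported in your shell profile. The variable names below are the conventional ones LiteLLM reads for each provider; the values are placeholders:

```shell
# Conventional LiteLLM environment variables (placeholder values)
export OPENAI_API_KEY="sk-your-openai-key"
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"
export GEMINI_API_KEY="your-google-ai-studio-key"
```

Only the providers you actually use need a key; PDD filters model selection to providers with valid keys.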
pdd setup scans for API keys across .env, ~/.pdd/api-env.*, and the shell environment. Run it after installation to configure keys, select models, and test connectivity.
LiteLLM integration
Local mode uses LiteLLM (version 1.75.5 or higher), which provides:
- Support for multiple model providers: OpenAI, Anthropic, Google/Vertex AI, and more
- Automatic model selection based on strength settings
- Response caching for improved performance
- Smart token usage tracking and cost estimation
Model identifiers follow the LiteLLM provider/model_name convention.
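For example, identifiers in this format might look like the following (illustrative names; consult your llm_model.csv for the models actually configured):

```
openai/gpt-4
anthropic/claude-3-opus-20240229
gemini/gemini-1.5-pro
```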
Model configuration
PDD selects models using a CSV configuration file (llm_model.csv). This file is loaded from the first location that exists, in order of precedence:
| Precedence | Location | Description |
|---|---|---|
| 1 (highest) | ~/.pdd/llm_model.csv | User-specific configuration |
| 2 | <PROJECT_ROOT>/.pdd/llm_model.csv | Project-specific configuration |
| 3 (lowest) | Bundled with PDD | Default fallback |
CSV columns
| Column | Description |
|---|---|
| provider | LLM provider (e.g., openai, anthropic, google) |
| model | LiteLLM model identifier (e.g., gpt-4, claude-3-opus-20240229) |
| input | Cost per million input tokens (USD) |
| output | Cost per million output tokens (USD) |
| coding_arena_elo | ELO rating for coding ability; used for --strength selection |
| api_key | Environment variable name for the required API key |
| structured_output | Whether the model supports structured JSON output |
| reasoning_type | Reasoning capability: none, budget, or effort |
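Putting the columns together, rows might look like this (illustrative values only, not real pricing or ELO data):

```
provider,model,input,output,coding_arena_elo,api_key,structured_output,reasoning_type
openai,gpt-4,30.00,60.00,1250,OPENAI_API_KEY,TRUE,none
anthropic,claude-3-opus-20240229,15.00,75.00,1270,ANTHROPIC_API_KEY,TRUE,budget
```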
The bundled default is pdd/data/llm_model.csv in the repository.
LLM strength
The --strength flag controls which model tier is selected based on ELO rating. It applies globally to any command:
| Value | Behavior |
|---|---|
| 0.0 | Cheapest available model (lowest ELO among available keys) |
| 0.5 | Default base model |
| 1.0 | Most powerful model (highest ELO rating) |
Values between 0.0 and 1.0 select proportionally within the available model range. PDD filters the model list to only those for which you have a valid API key, then selects by ELO position within that filtered set.
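The filter-then-select behavior described above can be sketched as follows. This is a simplified model of the documented behavior, not PDD's actual implementation, and all model entries are hypothetical:

```python
def select_model(models, strength):
    """Pick a model by ELO position among models whose API key is available.

    models: list of dicts with 'model', 'coding_arena_elo', 'has_key'.
    strength: 0.0 (lowest ELO) .. 1.0 (highest ELO).
    """
    # Keep only models backed by a configured API key, sorted by ELO.
    available = sorted(
        (m for m in models if m["has_key"]),
        key=lambda m: m["coding_arena_elo"],
    )
    if not available:
        raise RuntimeError("no API keys configured for any model")
    # Map strength proportionally onto the sorted list.
    index = round(strength * (len(available) - 1))
    return available[index]["model"]

models = [  # hypothetical entries
    {"model": "cheap-model", "coding_arena_elo": 1100, "has_key": True},
    {"model": "mid-model", "coding_arena_elo": 1200, "has_key": True},
    {"model": "top-model", "coding_arena_elo": 1300, "has_key": False},
]

print(select_model(models, 0.0))  # cheap-model
print(select_model(models, 1.0))  # mid-model (top-model lacks a key)
```

Note how the missing key for top-model caps the effective ceiling: 1.0 selects the best model you can actually reach.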
LLM reasoning allocation
The --time flag controls the reasoning allocation for models that support extended reasoning capabilities (e.g., models with reasoning tokens or discrete effort levels):
| Value | Behavior |
|---|---|
| 0.0 | Minimum reasoning allocation |
| 0.25 | Default |
| 1.0 | Maximum available reasoning tokens or highest effort level |
For budget-based models, 1.0 utilizes the maximum available reasoning tokens; for models with discrete effort levels, 1.0 corresponds to the highest effort level. Values between 0.0 and 1.0 scale proportionally.
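One way to picture the scaling for the two reasoning types (a hypothetical sketch, not PDD's internal code; the token ceiling and effort labels are assumptions):

```python
def reasoning_allocation(time_value, reasoning_type, max_tokens=32768,
                         effort_levels=("low", "medium", "high")):
    """Scale a --time value onto a model's reasoning capability.

    budget: proportional share of the maximum reasoning-token budget.
    effort: nearest discrete effort level.
    none:   the flag has no effect.
    """
    if reasoning_type == "budget":
        return int(time_value * max_tokens)
    if reasoning_type == "effort":
        index = round(time_value * (len(effort_levels) - 1))
        return effort_levels[index]
    return None  # reasoning_type == "none": --time is ignored

print(reasoning_allocation(1.0, "budget"))  # 32768
print(reasoning_allocation(1.0, "effort"))  # high
```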
The --time flag only affects models that have a reasoning_type of budget or effort in llm_model.csv. For models with reasoning_type: none, the flag has no effect.
Additional model controls
Temperature (--temperature)
Controls the randomness of model output. The default is 0.0 for deterministic, reproducible generation. Higher values increase diversity but may produce less coherent results.
Cost tracking (--output-cost)
Enables cost tracking and writes usage details to a CSV file. Tracks timestamp, model, command, cost (USD), input files, and output files for each operation. Set PDD_OUTPUT_COST_PATH to configure a default output path.
Verbose output (--verbose)
Includes token count and context window usage for each LLM call. Useful for diagnosing context size issues or optimizing prompt efficiency.
Budget limits (--budget)
Set a maximum total cost for the entire operation. PDD stops before exceeding the budget.
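The stop-before-exceeding behavior can be pictured as a pre-call check on estimated cost. This is a hypothetical sketch, not PDD's implementation; the call names and cost figures are invented:

```python
def run_with_budget(calls, budget):
    """Execute estimated-cost calls until the next one would exceed the budget.

    calls: list of (name, estimated_cost_usd) pairs.
    Returns the names actually run and the total spent.
    """
    spent, executed = 0.0, []
    for name, cost in calls:
        if spent + cost > budget:
            break  # stop *before* exceeding the budget
        spent += cost
        executed.append(name)
    return executed, spent

calls = [("generate", 0.50), ("verify", 0.25), ("fix", 0.50)]
print(run_with_budget(calls, budget=1.00))  # (['generate', 'verify'], 0.75)
```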
Choosing between cloud and local
Use cloud mode when:
- You want zero API key management
- You want access to the most powerful models without provider accounts
- You want automated grounding from the PDD community
- You are getting started with PDD for the first time

Use local mode when:
- You want full control over model selection
- You prefer to route requests through your own provider accounts and API keys