Heimdall uses large language models (LLMs) for threat modeling, vulnerability discovery, and code analysis. You bring your own API keys (BYOK).
## Supported Providers

Heimdall supports three AI providers:

| Provider | Models | Use Case |
|---|---|---|
| Anthropic (Claude) | Claude Sonnet 4, Claude Opus | Recommended for security analysis (native tool use) |
| OpenAI | GPT-4o, o1, o3-mini | Alternative with function calling |
| Ollama | Llama 3.3, Mistral, DeepSeek, Qwen | Local inference (no API key required) |
## Quick Start

Set at least one provider in your `.env` file:

```bash
# Anthropic (recommended)
ANTHROPIC_API_KEY=sk-ant-...

# Or OpenAI
OPENAI_API_KEY=sk-...

# Or Ollama (local)
OLLAMA_URL=http://localhost:11434

# Optional: override default model
DEFAULT_AI_MODEL=claude-sonnet-4-20250514
```

Restart Heimdall:

```bash
docker compose restart heimdall
```
## Anthropic (Claude)

### Getting an API Key

1. Sign up at [console.anthropic.com](https://console.anthropic.com)
2. Go to **API Keys** → **Create Key**
3. Copy the key (starts with `sk-ant-`)

### Configuration

Add to `.env`:

```bash
ANTHROPIC_API_KEY=sk-ant-api03-...
DEFAULT_AI_MODEL=claude-sonnet-4-20250514
```
### Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| `claude-sonnet-4-20250514` | Balanced performance (default) | 200k | Input: $3 / Output: $15 |
| `claude-opus-4-20250514` | Highest capability | 200k | Input: $15 / Output: $75 |
| `claude-3-5-sonnet-20241022` | Previous generation | 200k | Input: $3 / Output: $15 |
Claude is recommended because it supports Anthropic's native tool-use format, which the Hunt agent relies on for code analysis.
### Rate Limits
Anthropic enforces per-minute rate limits:
- Free tier: 5 RPM (requests per minute)
- Paid tier: 50-1000+ RPM (depending on usage tier)
If you hit rate limits, configure a fallback provider (see Fallback Chain).
## OpenAI

### Getting an API Key

1. Sign up at [platform.openai.com](https://platform.openai.com)
2. Go to **API Keys** → **Create new secret key**
3. Copy the key (starts with `sk-`)

### Configuration

Add to `.env`:

```bash
OPENAI_API_KEY=sk-...
DEFAULT_AI_MODEL=gpt-4o
```
### Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| `gpt-4o` | Optimized GPT-4 (default) | 128k | Input: $5 / Output: $15 |
| `o1-preview` | Reasoning model | 128k | Input: $15 / Output: $60 |
| `o3-mini` | Fast, cost-efficient | 128k | Input: $1 / Output: $4 |
### Rate Limits
- Free tier: 3 RPM
- Tier 1 ($5+ spent): 500 RPM
- Tier 5 ($1000+ spent): 10,000 RPM
See OpenAI rate limits for details.
## Ollama (Local)

Ollama runs LLMs locally, with no API key or internet connection required.

### Installation

**macOS / Linux:**

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

**Docker:**

```bash
docker run -d -p 11434:11434 --name ollama ollama/ollama
```
### Pull a Model

Recommended models for security analysis (download with `ollama pull <model>`):

- `llama3.3` (70B parameters, best quality)
- `deepseek-coder-v2` (optimized for code)
- `qwen2.5-coder` (fast, lightweight)
### Configuration

Add to `.env`:

```bash
OLLAMA_URL=http://localhost:11434
DEFAULT_AI_MODEL=llama3.3
```

For Docker deployments, use the container name:

```bash
OLLAMA_URL=http://ollama:11434
```
### Hardware Requirements

| Model | Parameters | RAM | VRAM (GPU) |
|---|---|---|---|
| `llama3.3` | 70B | 64 GB | 24 GB |
| `qwen2.5-coder:32b` | 32B | 32 GB | 16 GB |
| `deepseek-coder-v2:16b` | 16B | 16 GB | 8 GB |

Large models like Llama 3.3 (70B) require significant hardware. For smaller deployments, use quantized models (e.g., `llama3.3:8b-q4_0`).
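As a rough rule of thumb, memory for the weights alone is parameter count times bits per weight; the RAM column above is larger because of context (KV cache), the OS, and headroom. A minimal sketch of that arithmetic (the function and exact-GB framing are illustrative, not from Heimdall):

```python
def approx_weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Rough memory needed just to hold the model weights, in GB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A 70B model at 4-bit quantization needs ~35 GB for weights alone,
# which is why the table budgets 64 GB RAM with headroom.
print(approx_weight_memory_gb(70, 4))   # 35.0
print(approx_weight_memory_gb(70, 16))  # 140.0 (unquantized fp16)
```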
## Fallback Provider Chain

When multiple providers are configured, Heimdall automatically chains them in priority order:

Priority: Anthropic → OpenAI → Ollama

If the primary provider fails with a retryable error, the request falls through to the next provider.
### Retryable Errors
- HTTP 429: Rate limit exceeded
- HTTP 500/502/503/529: Server errors
- Network errors: Connection timeout, DNS failure
- Billing errors: Insufficient credits, quota exceeded
### Non-Retryable Errors
These propagate immediately (no fallback):
- HTTP 401: Invalid API key
- HTTP 400: Malformed request
- HTTP 404: Model not found
### Example Configuration

```bash
# Primary: Claude (fastest, best quality)
ANTHROPIC_API_KEY=sk-ant-...

# Fallback 1: OpenAI (if Claude rate-limited)
OPENAI_API_KEY=sk-...

# Fallback 2: Ollama (if both cloud providers fail)
OLLAMA_URL=http://localhost:11434
```

Result: every LLM call attempts Claude first. If it fails with a retryable error (e.g., HTTP 429), OpenAI is tried. If OpenAI also fails, Ollama is used as a last resort.
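The chain described above amounts to a loop that stops on the first success or the first non-retryable error. A minimal sketch (`ProviderError` and the provider callables are illustrative stand-ins, not Heimdall's real types):

```python
class ProviderError(Exception):
    """Illustrative stand-in for a failed LLM call."""
    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.retryable = status in {429, 500, 502, 503, 529}

def call_with_fallback(providers, prompt):
    """Try providers in priority order; fall through only on retryable errors."""
    last_error = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as err:
            if not err.retryable:
                raise            # e.g. HTTP 401: no other provider can fix this
            last_error = err     # rate limit / server error: try the next provider
    raise last_error

def claude(prompt):
    raise ProviderError(429)     # simulate: Claude is rate-limited

def openai(prompt):
    return "analysis complete"

result = call_with_fallback([("anthropic", claude), ("openai", openai)], "scan")
print(result)  # ('openai', 'analysis complete')
```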
## Observability

Every LLM call records which provider and model were actually used:

```sql
SELECT provider, model, tool_name, created_at
FROM agent_tool_calls
WHERE scan_id = '<your_scan_id>'
ORDER BY created_at DESC;
```

This lets you audit which provider served each request, which is especially useful when the fallback chain kicks in.

See `src/ai/fallback.rs` for the implementation.
## Model Selection

Heimdall infers the provider from the model name:

| Model Name Contains | Provider |
|---|---|
| `claude` | Anthropic |
| `gpt`, `o1`, `o3`, `o4` | OpenAI |
| `llama`, `mistral`, `qwen`, `deepseek`, `phi`, `gemma`, `codellama` | Ollama |
### Override Default Model

Set `DEFAULT_AI_MODEL` in `.env`. If the model doesn't match the configured provider, Heimdall falls back to a safe default:

- Anthropic → `claude-sonnet-4-20250514`
- OpenAI → `gpt-4o`
- Ollama → `llama3.3`

See `src/ai/mod.rs:98-106` for the fallback logic.
## BYOK (Bring Your Own Key) Approach

Heimdall never provides or manages API keys. You maintain full control:

- **User-level keys:** set via the Settings UI after registration
- **System-level keys:** set in `.env` for all users

User keys override system keys.
### Encryption at Rest

API keys stored in the database are encrypted with AES-256-GCM:

```bash
# Generate a 32-byte encryption key
openssl rand -hex 32

# Add to .env
ENCRYPTION_KEY=<generated_key>
```

If `ENCRYPTION_KEY` is not set, keys are stored as hex-encoded plaintext. Always configure encryption for production.

See `src/crypto.rs` for the encryption implementation.
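A generated key can be sanity-checked before deployment. Python's `secrets.token_hex(32)` produces a key of the same shape as the `openssl` command above; the validation helper below is illustrative, not part of Heimdall:

```python
import secrets

def is_valid_encryption_key(key: str) -> bool:
    """AES-256 requires exactly 32 bytes (64 hex characters)."""
    try:
        return len(bytes.fromhex(key)) == 32
    except ValueError:
        return False  # not valid hex (odd length or non-hex characters)

key = secrets.token_hex(32)           # same shape as `openssl rand -hex 32`
print(len(key))                       # 64
print(is_valid_encryption_key(key))   # True
print(is_valid_encryption_key("abc")) # False
```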
## Testing Connections

Verify your provider configuration:

### Via Settings UI

1. Navigate to **Settings → AI Providers**
2. Click **Test Connection** next to each provider
3. A successful test shows a green checkmark

### Via API

```bash
curl -X POST http://localhost:8080/api/settings/test-connection \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your_session_token>" \
  -d '{"provider": "anthropic", "api_key": "sk-ant-..."}'
```
Response:

```json
{
  "status": "ok",
  "data": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514",
    "working": true
  }
}
```
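In a provisioning script you might assert on this response before proceeding, e.g. (field names taken from the example response above):

```python
import json

# Example response body, as returned by /api/settings/test-connection above
response_body = """
{ "status": "ok",
  "data": { "provider": "anthropic",
            "model": "claude-sonnet-4-20250514",
            "working": true } }
"""

resp = json.loads(response_body)
assert resp["status"] == "ok" and resp["data"]["working"], "provider test failed"
print(resp["data"]["provider"], resp["data"]["model"])
# anthropic claude-sonnet-4-20250514
```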
## Troubleshooting

### No AI Provider Configured

Symptom: Scans fail with "No AI provider configured"

Solution: Set at least one provider in `.env` and restart Heimdall.
### Invalid API Key

Symptom: HTTP 401 errors in logs

Solutions:

- Verify the key format:
  - Anthropic: `sk-ant-...`
  - OpenAI: `sk-...`
- Check for extra whitespace in `.env`
- Regenerate the key in your provider console
### Rate Limit Exceeded

Symptom: HTTP 429 errors, scans pause frequently

Solutions:

- Upgrade to a paid tier with higher limits
- Configure a fallback provider:

  ```bash
  ANTHROPIC_API_KEY=sk-ant-...  # Primary
  OPENAI_API_KEY=sk-...         # Fallback
  ```

- Reduce concurrent scans
### Ollama Connection Refused

Symptom: "Failed to connect to Ollama at http://localhost:11434"

Solutions:

- Verify Ollama is running:

  ```bash
  curl http://localhost:11434/api/tags
  ```

- Check Docker networking (if using containers):

  ```bash
  # Use container name instead of localhost
  OLLAMA_URL=http://ollama:11434
  ```

- Ensure the model is pulled with `ollama pull <model>`
### Model Not Found

Symptom: "Model 'xyz' not found"

Solutions:

- Verify the model ID is correct (check provider docs)
- For Ollama, pull the model with `ollama pull <model_id>`
- For cloud providers, ensure your API tier has access to the model
## Cost Optimization

- Use Claude Sonnet (not Opus) for most scans; it's 5× cheaper
- Configure Ollama as a fallback to avoid overage charges
- Monitor usage:

  ```sql
  SELECT provider, model, COUNT(*) AS calls,
         AVG(input_tokens) AS avg_input,
         AVG(output_tokens) AS avg_output
  FROM agent_tool_calls
  WHERE created_at > NOW() - INTERVAL '30 days'
  GROUP BY provider, model;
  ```
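Token counts from that query can be combined with the per-model prices listed earlier to get a rough spend estimate (a sketch, not an official billing calculation; prices are per 1M tokens from the tables above):

```python
# (input $/1M tokens, output $/1M tokens), from the pricing tables above
PRICES = {
    "claude-sonnet-4-20250514": (3.0, 15.0),
    "claude-opus-4-20250514": (15.0, 75.0),
    "gpt-4o": (5.0, 15.0),
}

def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough cost for a given token volume on one model."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# 10M input + 2M output tokens on Sonnet in a month:
print(estimate_cost_usd("claude-sonnet-4-20250514", 10_000_000, 2_000_000))  # 60.0
```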
## API Reference

| Endpoint | Method | Description |
|---|---|---|
| `/api/settings` | GET | Get AI provider status |
| `/api/settings/api-keys` | POST | Store user API key |
| `/api/settings/api-keys/{id}` | DELETE | Delete user API key |
| `/api/settings/test-connection` | POST | Test provider connection |

See the API Reference for full details.