Heimdall uses large language models (LLMs) for threat modeling, vulnerability discovery, and code analysis. You bring your own API keys (BYOK).

Supported Providers

Heimdall supports three AI providers:
| Provider | Models | Use Case |
|---|---|---|
| Anthropic (Claude) | Claude Sonnet 4, Claude Opus | Recommended for security analysis (native tool use) |
| OpenAI | GPT-4o, o1, o3-mini | Alternative with function calling |
| Ollama | Llama 3.3, Mistral, DeepSeek, Qwen | Local inference (no API key required) |

Quick Start

Set at least one provider in your .env file:
# Anthropic (recommended)
ANTHROPIC_API_KEY=sk-ant-...

# Or OpenAI
OPENAI_API_KEY=sk-...

# Or Ollama (local)
OLLAMA_URL=http://localhost:11434

# Optional: override default model
DEFAULT_AI_MODEL=claude-sonnet-4-20250514
Restart Heimdall:
docker compose restart heimdall

Anthropic (Claude)

Getting an API Key

  1. Sign up at console.anthropic.com
  2. Go to API Keys → Create Key
  3. Copy the key (starts with sk-ant-)

Configuration

Add to .env:
ANTHROPIC_API_KEY=sk-ant-api03-...
DEFAULT_AI_MODEL=claude-sonnet-4-20250514

Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| claude-sonnet-4-20250514 | Balanced performance (default) | 200k | Input: $3, Output: $15 |
| claude-opus-4-20250514 | Highest capability | 200k | Input: $15, Output: $75 |
| claude-3-5-sonnet-20241022 | Previous generation | 200k | Input: $3, Output: $15 |
Claude is recommended because it supports native tool use format, which the Hunt agent relies on for code analysis.

Rate Limits

Anthropic enforces per-minute rate limits:
  • Free tier: 5 RPM (requests per minute)
  • Paid tier: 50-1000+ RPM (depending on usage tier)
If you hit rate limits, configure a fallback provider (see Fallback Chain).
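Heimdall's fallback chain handles rate limits automatically, but if you script against a provider API directly, the standard remedy is exponential backoff with jitter. A minimal sketch, with a hypothetical `RateLimitError` standing in for an HTTP 429 response:

```python
import random
import time

class RateLimitError(Exception):
    """Raised when a provider returns HTTP 429."""

def call_with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a rate-limited call with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the 429
            # Wait 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)
```

The jitter term spreads out retries from concurrent scans so they do not all hit the provider at the same instant.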

OpenAI

Getting an API Key

  1. Sign up at platform.openai.com
  2. Go to API Keys → Create new secret key
  3. Copy the key (starts with sk-)

Configuration

Add to .env:
OPENAI_API_KEY=sk-...
DEFAULT_AI_MODEL=gpt-4o

Supported Models

| Model ID | Description | Context | Cost (per 1M tokens) |
|---|---|---|---|
| gpt-4o | Optimized GPT-4 (default) | 128k | Input: $5, Output: $15 |
| o1-preview | Reasoning model | 128k | Input: $15, Output: $60 |
| o3-mini | Fast, cost-efficient | 128k | Input: $1, Output: $4 |

Rate Limits

  • Free tier: 3 RPM
  • Tier 1 ($5+ spent): 500 RPM
  • Tier 5 ($1000+ spent): 10,000 RPM
See OpenAI rate limits for details.

Ollama (Local)

Ollama runs LLMs locally — no API key or internet connection required.

Installation

macOS / Linux:
curl -fsSL https://ollama.com/install.sh | sh
Docker:
docker run -d -p 11434:11434 --name ollama ollama/ollama

Pull a Model

ollama pull llama3.3
Recommended models for security analysis:
  • llama3.3 (70B parameters, best quality)
  • deepseek-coder-v2 (optimized for code)
  • qwen2.5-coder (fast, lightweight)

Configuration

Add to .env:
OLLAMA_URL=http://localhost:11434
DEFAULT_AI_MODEL=llama3.3
For Docker deployments, use the container name:
OLLAMA_URL=http://ollama:11434

Hardware Requirements

| Model | Parameters | RAM | VRAM (GPU) |
|---|---|---|---|
| llama3.3 | 70B | 64 GB | 24 GB |
| qwen2.5-coder:32b | 32B | 32 GB | 16 GB |
| deepseek-coder-v2:16b | 16B | 16 GB | 8 GB |
Large models like Llama 3.3 (70B) require significant hardware. For smaller deployments, use a smaller or quantized model (e.g., llama3.1:8b).

Fallback Provider Chain

When multiple providers are configured, Heimdall automatically chains them in priority order: Anthropic → OpenAI → Ollama. If the primary provider fails with a retryable error, the request falls through to the next provider.

Retryable Errors

  • HTTP 429: Rate limit exceeded
  • HTTP 500/502/503/529: Server errors
  • Network errors: Connection timeout, DNS failure
  • Billing errors: Insufficient credits, quota exceeded

Non-Retryable Errors

These propagate immediately (no fallback):
  • HTTP 401: Invalid API key
  • HTTP 400: Malformed request
  • HTTP 404: Model not found
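The retry-then-fall-through behavior above can be sketched as follows. The names and error type are illustrative, not Heimdall's actual API (the real implementation is in src/ai/fallback.rs):

```python
# Status codes that trigger fallback to the next provider in the chain.
RETRYABLE_STATUS = {429, 500, 502, 503, 529}

class ProviderError(Exception):
    """Hypothetical error carrying the provider's HTTP status code."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

def call_with_fallback(providers, prompt):
    """Try providers in priority order; fall through only on retryable errors."""
    last_err = None
    for provider in providers:
        try:
            return provider(prompt)
        except ProviderError as e:
            if e.status not in RETRYABLE_STATUS:
                raise          # 401 / 400 / 404 propagate immediately
            last_err = e       # retryable: move on to the next provider
    if last_err is None:
        raise RuntimeError("no providers configured")
    raise last_err
```

Note that a non-retryable error (e.g., an invalid API key) aborts the chain immediately rather than masking a misconfiguration behind a slower fallback.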

Example Configuration

# Primary: Claude (fastest, best quality)
ANTHROPIC_API_KEY=sk-ant-...

# Fallback 1: OpenAI (if Claude rate-limited)
OPENAI_API_KEY=sk-...

# Fallback 2: Ollama (if both cloud providers fail)
OLLAMA_URL=http://localhost:11434
Result: Every LLM call attempts Claude first. If it fails with a retryable error such as HTTP 429, OpenAI is tried. If OpenAI also fails, Ollama is used as a last resort.

Observability

Every LLM call records which provider and model was actually used:
SELECT provider, model, tool_name, created_at 
FROM agent_tool_calls 
WHERE scan_id = '<your_scan_id>'
ORDER BY created_at DESC;
This allows you to audit which provider served each request — especially useful when fallback kicks in. See src/ai/fallback.rs for the implementation.

Model Selection

Heimdall infers the provider from the model name:
| Model Name Contains | Provider |
|---|---|
| claude | Anthropic |
| gpt, o1, o3, o4 | OpenAI |
| llama, mistral, qwen, deepseek, phi, gemma, codellama | Ollama |

Override Default Model

Set DEFAULT_AI_MODEL in .env:
DEFAULT_AI_MODEL=gpt-4o
If the model doesn’t match the provider, Heimdall falls back to a safe default:
  • Anthropic → claude-sonnet-4-20250514
  • OpenAI → gpt-4o
  • Ollama → llama3.3
See src/ai/mod.rs:98-106 for the fallback logic.
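The substring inference and safe-default rules above can be sketched like this (a hypothetical Python rendering; the actual logic lives in src/ai/mod.rs):

```python
# Substring hints used to infer the provider from a model name.
PROVIDER_HINTS = {
    "anthropic": ["claude"],
    "openai": ["gpt", "o1", "o3", "o4"],
    "ollama": ["llama", "mistral", "qwen", "deepseek", "phi", "gemma", "codellama"],
}

# Safe defaults used when DEFAULT_AI_MODEL doesn't match the provider.
DEFAULTS = {
    "anthropic": "claude-sonnet-4-20250514",
    "openai": "gpt-4o",
    "ollama": "llama3.3",
}

def infer_provider(model: str):
    name = model.lower()
    for provider, hints in PROVIDER_HINTS.items():
        if any(hint in name for hint in hints):
            return provider
    return None

def resolve_model(model: str, provider: str) -> str:
    # Keep the configured model only if it belongs to this provider.
    if infer_provider(model) == provider:
        return model
    return DEFAULTS[provider]
```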

BYOK (Bring Your Own Key) Approach

Heimdall never provides or manages API keys. You maintain full control:
  • User-level keys: Set via Settings UI after registration
  • System-level keys: Set in .env for all users
User keys override system keys.
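The precedence rule can be sketched as a small helper (hypothetical; Heimdall resolves user keys from its database, not from a dict):

```python
import os

# Environment variables holding the system-level keys from .env.
ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}

def resolve_api_key(provider: str, user_keys: dict):
    """A user-level key (set in the Settings UI) overrides the system key."""
    return user_keys.get(provider) or os.environ.get(ENV_VARS[provider])
```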

Encryption at Rest

API keys stored in the database are encrypted with AES-256-GCM:
# Generate a 32-byte encryption key
openssl rand -hex 32

# Add to .env
ENCRYPTION_KEY=<generated_key>
If ENCRYPTION_KEY is not set, keys are stored as hex-encoded plaintext. Always configure encryption for production.
See src/crypto.rs for the encryption implementation.
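For illustration, the encrypt/decrypt round trip described above looks roughly like this in Python using the `cryptography` package; the nonce layout and helper names are assumptions, and the real implementation is the Rust code in src/crypto.rs:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_key(api_key: str, hex_key: str) -> bytes:
    aead = AESGCM(bytes.fromhex(hex_key))  # 32-byte hex key -> AES-256
    nonce = os.urandom(12)                 # fresh nonce for every encryption
    return nonce + aead.encrypt(nonce, api_key.encode(), None)

def decrypt_key(blob: bytes, hex_key: str) -> str:
    aead = AESGCM(bytes.fromhex(hex_key))
    nonce, ciphertext = blob[:12], blob[12:]
    return aead.decrypt(nonce, ciphertext, None).decode()
```

GCM authenticates the ciphertext, so a tampered database row fails decryption instead of silently yielding a corrupted key.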

Testing Connections

Verify your provider configuration:

Via Settings UI

  1. Navigate to Settings → AI Providers
  2. Click Test Connection next to each provider
  3. Successful test shows a green checkmark

Via API

curl -X POST http://localhost:8080/api/settings/test-connection \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your_session_token>" \
  -d '{"provider": "anthropic", "api_key": "sk-ant-..."}'
Response:
{
  "status": "ok",
  "data": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514",
    "working": true
  }
}

Troubleshooting

No AI Provider Configured

Symptom: Scans fail with “No AI provider configured”
Solution: Set at least one provider in .env and restart Heimdall.

Invalid API Key

Symptom: HTTP 401 errors in logs
Solutions:
  1. Verify the key format:
    • Anthropic: sk-ant-...
    • OpenAI: sk-...
  2. Check for extra whitespace in .env
  3. Regenerate the key in your provider console

Rate Limit Exceeded

Symptom: HTTP 429 errors, scans pause frequently
Solutions:
  1. Upgrade to a paid tier with higher limits
  2. Configure a fallback provider:
    ANTHROPIC_API_KEY=sk-ant-...  # Primary
    OPENAI_API_KEY=sk-...         # Fallback
    
  3. Reduce concurrent scans

Ollama Connection Refused

Symptom: “Failed to connect to Ollama at http://localhost:11434”
Solutions:
  1. Verify Ollama is running:
    curl http://localhost:11434/api/tags
    
  2. Check Docker networking (if using containers):
    # Use container name instead of localhost
    OLLAMA_URL=http://ollama:11434
    
  3. Ensure the model is pulled:
    ollama pull llama3.3
    

Model Not Found

Symptom: “Model ‘xyz’ not found”
Solutions:
  1. Verify the model ID is correct (check provider docs)
  2. For Ollama, pull the model:
    ollama pull <model_name>
    
  3. For cloud providers, ensure your API tier has access to the model

Cost Optimization

  1. Use Claude Sonnet (not Opus) for most scans — it’s 5× cheaper
  2. Configure Ollama as fallback to avoid overage charges
  3. Monitor usage:
    SELECT provider, model, COUNT(*) as calls, AVG(input_tokens) as avg_input, AVG(output_tokens) as avg_output
    FROM agent_tool_calls
    WHERE created_at > NOW() - INTERVAL '30 days'
    GROUP BY provider, model;
    

API Reference

| Endpoint | Method | Description |
|---|---|---|
| /api/settings | GET | Get AI provider status |
| /api/settings/api-keys | POST | Store user API key |
| /api/settings/api-keys/{id} | DELETE | Delete user API key |
| /api/settings/test-connection | POST | Test provider connection |
See API Reference for full details.
