Proxy mode

The Drako proxy is an out-of-process enforcement layer. It intercepts LLM API calls at the network level — before they reach OpenAI or Anthropic — and runs the governance pipeline without any changes to your agent code. Because the proxy runs as a separate process, the agent cannot bypass it by modifying its own code at runtime.

How it works

Agent process
  └─ makes HTTP call to OpenAI/Anthropic
       └─ intercepted by Drako Proxy (localhost:8990)
            ├─ 1. Resolve target API (openai / anthropic)
            ├─ 2. Parse request body
            ├─ 3. Extract agent ID + tool name from headers/body
            ├─ 4. Run governance pipeline:
            │     ├─ ODD check  (forbidden/permitted tools)
            │     ├─ Magnitude  (rate limiting per agent)
            │     ├─ DLP        (PII/PCI scanning)
            │     └─ HITL       (human approval gate)
            ├─ 5. Forward to real API (if allowed)
            ├─ 6. Track cost from response usage
            └─ 7. Return response + governance headers

Blocked requests never reach the upstream API.

Quick start

Install proxy extras

pip install 'drako[proxy]'

Start the proxy

drako proxy start

The proxy listens on port 8990 by default. To use a custom port:

drako proxy start --port 9000

Point your agent at the proxy

export OPENAI_BASE_URL=http://localhost:8990/openai/v1
export ANTHROPIC_BASE_URL=http://localhost:8990/anthropic/v1

No code changes in your agent. All existing OpenAI/Anthropic SDK calls route through the proxy automatically when the base URL is set.

Identifying the calling agent

The proxy reads the X-Drako-Agent request header to identify which agent is making the call. This is required for per-agent ODD rules and magnitude limits.

import openai

client = openai.OpenAI(
    base_url="http://localhost:8990/openai/v1",
    default_headers={"X-Drako-Agent": "researcher"},
)

If the header is absent, the proxy logs as agent=unknown and agent-specific ODD checks are skipped.

What gets enforced

The proxy runs a subset of the full SDK enforcement pipeline:

ODD — Operational Design Domain

Checks whether the tool being called is permitted for this agent:

policies:
  odd:
    enforcement_mode: enforce   # audit | enforce | off
    agents:
      researcher:
        permitted_tools: [web_search, file_reader]
        forbidden_tools: [code_runner, shell_exec]

The tool name is extracted from the request body (function_call, tool_choice, or tool_calls in the last assistant message).

Magnitude — rate limiting

Tracks actions per agent per minute:

policies:
  magnitude:
    max_actions_per_minute: 60
    enforcement_mode: enforce   # audit | enforce

DLP — Data Loss Prevention

Scans the full request payload (all messages) for PII patterns:

Pattern	What it detects
`SSN`	`\d{3}-\d{2}-\d{4}`
`credit_card`	16-digit card numbers (with spaces/dashes)
`email_pii`	Email addresses

policies:
  dlp:
    mode: enforce   # audit | enforce | off

HITL — Human-in-the-Loop

Holds specific tool calls for human approval before forwarding to the upstream API:

policies:
  hitl:
    mode: enforce
    triggers:
      tools: [delete_file, send_email, execute_payment]
    timeout_action: reject   # reject | allow
    approval_timeout_minutes: 30

When a matching tool call arrives, the proxy returns HTTP 202 with an approval_id. The agent must poll the approval endpoint or implement a callback.

Blocked request responses

HTTP status	Cause
`403`	ODD violation (tool forbidden) or DLP violation (PII/PCI detected)
`429`	Magnitude limit exceeded
`202`	HITL pending — request held for human approval

Monitoring endpoints

The proxy exposes three read-only HTTP endpoints:

{
  "status": "healthy",
  "governance_level": "balanced",
  "targets": ["openai", "anthropic"],
  "audit_entries": 42
}

GET /status returns detailed proxy status including per-agent cost totals and action counts. Decisions are allowed, rejected, or pending_approval. Every proxied response also includes governance headers:

Header	Value
`X-Drako-Proxy`	`true`
`X-Drako-Latency-Ms`	Round-trip latency in milliseconds
`X-Drako-Cost-USD`	Estimated cost (when model pricing is known)

Deployment

Docker
Docker Compose
Helm

docker run -d \
  -p 8990:8990 \
  -v $(pwd)/.drako.yaml:/app/.drako.yaml:ro \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  angelnicolasc/drako-proxy:latest

services:
  drako-proxy:
    image: angelnicolasc/drako-proxy:latest
    ports:
      - "8990:8990"
    volumes:
      - ./.drako.yaml:/app/.drako.yaml:ro
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}

  my-agent:
    build: .
    environment:
      - OPENAI_BASE_URL=http://drako-proxy:8990/openai/v1
    depends_on:
      - drako-proxy

The drako-proxy Helm chart is available in the deploy/helm/ directory. The chart deploys the proxy as a sidecar or standalone service with a ConfigMap for .drako.yaml.

helm install drako-proxy ./deploy/helm/drako-proxy \
  --set config.openaiApiKey=$OPENAI_API_KEY

Proxy vs SDK integration

	Proxy mode	SDK (`govern()`)
Code changes required	None	One line
Agent can bypass	No	No (middleware wraps every call)
Intent verification	Not available	Available
HITL approval UI	Bring your own callback	Built-in webhook + UI
Persistent audit trail	In-memory only (reset on restart)	Persistent, exportable
Multi-agent topology	Not available	Available
Best for	Quick adoption, legacy codebases	Full governance, compliance requirements

Proxy mode maintains state in memory only. Audit logs, session stats, and cost totals are reset on proxy restart. For persistent audit trails and the full 13-stage pipeline, use the SDK runtime with govern().

Get Started

Scanning

Runtime Enforcement

Configuration

Observability & Compliance

Integrations

How it works

Quick start

Identifying the calling agent

What gets enforced

Blocked request responses

Monitoring endpoints

Deployment

Proxy vs SDK integration

Build docs developers (and LLMs) love

Get Started

Scanning

Runtime Enforcement

Configuration

Observability & Compliance

Integrations

​How it works

​Quick start

​Identifying the calling agent

​What gets enforced

​Blocked request responses

​Monitoring endpoints

​Deployment

​Proxy vs SDK integration

Build docs developers (and LLMs) love

How it works

Quick start

Identifying the calling agent

What gets enforced

Blocked request responses

Monitoring endpoints

Deployment

Proxy vs SDK integration