Fallbacks improve the reliability of your AI application by automatically routing to a secondary provider or model when a primary request fails. If the first target returns an error, the gateway tries the next target in the list, and so on until a request succeeds or all targets are exhausted.

Basic configuration

Set strategy.mode to "fallback" and list your targets in priority order. The first target is tried first; subsequent targets are used only if the previous one fails.
{
  "strategy": {"mode": "fallback"},
  "targets": [
    {
      "provider": "openai",
      "override_params": {"model": "gpt-4o"}
    },
    {
      "provider": "anthropic",
      "override_params": {"model": "claude-3-5-sonnet-20241022"}
    }
  ]
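}

The same configuration can be passed inline through the Python SDK, here pointed at a locally running gateway:

from portkey_ai import Portkey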

client = Portkey(
    base_url="http://localhost:8787/v1",
    config={
      "strategy": {"mode": "fallback"},
      "targets": [
        {"provider": "openai", "api_key": "sk-...",
         "override_params": {"model": "gpt-4o"}},
        {"provider": "anthropic", "api_key": "sk-ant-...",
         "override_params": {"model": "claude-3-5-sonnet-20241022"}}
      ]
    }
)

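# The gateway tries OpenAI first and falls back to Anthropic on failure.
# Each target's override_params sets the model actually sent to that provider.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)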

Triggering conditions

By default, a fallback is triggered when the current target returns any error response. You can restrict fallbacks to specific HTTP status codes using strategy.on_status_codes:
{
  "strategy": {
    "mode": "fallback",
    "on_status_codes": [429, 500, 502, 503, 504]
  },
  "targets": [
    {"provider": "openai", "override_params": {"model": "gpt-4o"}},
    {"provider": "anthropic", "override_params": {"model": "claude-3-5-sonnet-20241022"}}
  ]
}
When on_status_codes is set, errors with codes not in the list propagate immediately to the caller, and the gateway does not attempt the next target.
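
The trigger rule described above can be modeled as a small predicate. This is only a sketch of the documented behavior, not the gateway's implementation: the function name should_fall_back is hypothetical, and treating "error" as any 4xx or 5xx status is an assumption.

```python
from typing import Optional, Sequence


def should_fall_back(status: int,
                     on_status_codes: Optional[Sequence[int]] = None) -> bool:
    """Return True if the next target should be tried for this status."""
    # Assumption: an "error response" means a 4xx or 5xx HTTP status.
    if status < 400:
        return False
    # No on_status_codes configured: any error triggers a fallback.
    if on_status_codes is None:
        return True
    # on_status_codes configured: only listed codes trigger a fallback.
    return status in on_status_codes


codes = [429, 500, 502, 503, 504]
print(should_fall_back(429, codes))  # True
print(should_fall_back(401, codes))  # False: the 401 goes straight to the caller
print(should_fall_back(401))         # True: default behavior
```

Restricting the list to rate-limit and server errors, as in the config above, keeps client-side mistakes such as a bad request or an invalid key from being retried against every target.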

Fallback chains

You can chain more than two targets to create a multi-level fallback:
{
  "strategy": {"mode": "fallback"},
  "targets": [
    {"provider": "openai", "override_params": {"model": "gpt-4o"}},
    {"provider": "anthropic", "override_params": {"model": "claude-3-5-sonnet-20241022"}},
    {"provider": "mistral-ai", "override_params": {"model": "mistral-large-latest"}},
    {"provider": "groq", "override_params": {"model": "llama-3.1-70b-versatile"}}
  ]
}
The gateway works through targets in order until one succeeds. If all targets fail, the last error is returned to the caller.
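
To make the ordering concrete, the following sketch simulates a four-target chain locally. It is an illustration of the behavior described above under stated assumptions, not gateway code: TargetError, run_fallback_chain, and fake_send are hypothetical names, and the failures are hard-coded.

```python
class TargetError(Exception):
    """Hypothetical error raised when a target returns an error response."""

    def __init__(self, provider, status):
        super().__init__(f"{provider} returned {status}")
        self.provider = provider
        self.status = status


def run_fallback_chain(targets, send):
    """Try targets in order; return (index, response) for the first success."""
    if not targets:
        raise ValueError("targets must not be empty")
    last_error = None
    for index, target in enumerate(targets):
        try:
            return index, send(target)
        except TargetError as err:
            last_error = err  # remember it and move on to the next target
    # Every target failed: surface the last error, as described above.
    raise last_error


targets = [
    {"provider": "openai"},
    {"provider": "anthropic"},
    {"provider": "mistral-ai"},
    {"provider": "groq"},
]


def fake_send(target):
    # Simulated outage: the first two providers fail, the third answers.
    if target["provider"] in {"openai", "anthropic"}:
        raise TargetError(target["provider"], 503)
    return {"served_by": target["provider"]}


index, response = run_fallback_chain(targets, fake_send)
print(index, response)  # 2 {'served_by': 'mistral-ai'}
```

Note that groq is never contacted in this run, because the chain stops at the first target that succeeds.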

Fallback across API keys

Fallbacks are not limited to different providers. You can fall back across multiple API keys for the same provider:
{
  "strategy": {"mode": "fallback"},
  "targets": [
    {"provider": "openai", "api_key": "sk-primary-..."},
    {"provider": "openai", "api_key": "sk-backup-..."}
  ]
}
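
In practice the keys usually come from the environment rather than being written into the config. The helper below is a minimal sketch of that idea: build_key_fallback_config and the variable names OPENAI_API_KEY_PRIMARY and OPENAI_API_KEY_BACKUP are hypothetical, and the demo uses a plain dict in place of the real environment so it runs anywhere.

```python
import os


def build_key_fallback_config(provider, env_var_names, env=os.environ):
    """Build a fallback config with one target per API key that is set."""
    targets = []
    for name in env_var_names:
        key = env.get(name)
        if key:  # skip variables that are unset or empty
            targets.append({"provider": provider, "api_key": key})
    if not targets:
        raise ValueError("no API keys found for the given variable names")
    return {"strategy": {"mode": "fallback"}, "targets": targets}


# Stand-in for os.environ; the values are placeholders, not real keys.
demo_env = {
    "OPENAI_API_KEY_PRIMARY": "sk-primary-placeholder",
    "OPENAI_API_KEY_BACKUP": "sk-backup-placeholder",
}
config = build_key_fallback_config(
    "openai",
    ["OPENAI_API_KEY_PRIMARY", "OPENAI_API_KEY_BACKUP"],
    env=demo_env,
)
print(len(config["targets"]))  # 2
```

The order of env_var_names becomes the priority order of the targets, so the primary key should be listed first.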

Combining with retries

Retries and fallbacks work together. The gateway first exhausts all retry attempts for the current target, then moves to the next fallback target.
{
  "strategy": {"mode": "fallback"},
  "retry": {"attempts": 2},
  "targets": [
    {"provider": "openai", "override_params": {"model": "gpt-4o"}},
    {"provider": "anthropic", "override_params": {"model": "claude-3-5-sonnet-20241022"}}
  ]
}
With this config, the gateway makes up to 3 total attempts to OpenAI (1 original + 2 retries) before falling back to Anthropic.
Apply per-target retry settings to fine-tune how many times each provider is retried before the next fallback is triggered.
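
As an illustration of per-target retries, the sketch below places a retry block inside each target and computes the worst-case number of attempts per provider. The placement of the "retry" key inside a target, and the rule that a target-level setting takes precedence over a top-level one, are assumptions made for this example; max_attempts_per_target is a hypothetical helper, not part of the SDK.

```python
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "openai",
            "retry": {"attempts": 2},  # assumed per-target placement
            "override_params": {"model": "gpt-4o"},
        },
        {
            "provider": "anthropic",
            "retry": {"attempts": 1},
            "override_params": {"model": "claude-3-5-sonnet-20241022"},
        },
    ],
}


def max_attempts_per_target(config):
    """Return (provider, worst-case attempts) for each target, in order."""
    # Assumption: a top-level retry setting is the default for every target.
    default = config.get("retry", {}).get("attempts", 0)
    result = []
    for target in config["targets"]:
        retries = target.get("retry", {}).get("attempts", default)
        result.append((target["provider"], 1 + retries))  # 1 original + retries
    return result


print(max_attempts_per_target(config))  # [('openai', 3), ('anthropic', 2)]
```

Under these assumptions, a request that fails everywhere costs up to 5 upstream calls, which is worth keeping in mind when setting client timeouts.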

Response headers

The gateway sets response headers so you can observe fallback behavior:
Header                              Description
x-portkey-last-used-option-index    Index of the target that produced the final response
x-portkey-last-used-option-params   Parameters of the last-used target
x-portkey-retry-attempt-count       Number of retry attempts made
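
Once you have the response headers as a mapping (how you obtain them depends on your HTTP client or SDK), a small helper can turn them into a fallback summary. This is a sketch under assumptions: describe_fallback is a hypothetical name, and it assumes the index and count headers carry plain integers with 0 meaning the first target. Values that do not parse are reported as None rather than guessed.

```python
from typing import Mapping, Optional


def describe_fallback(headers: Mapping[str, str]) -> dict:
    """Summarize fallback and retry activity from gateway response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        if value is None:
            return None
        try:
            return int(value)
        except ValueError:
            return None  # assumption about the value format did not hold

    index = as_int("x-portkey-last-used-option-index")
    return {
        "target_index": index,
        # Assumption: index 0 is the primary target.
        "fell_back": None if index is None else index > 0,
        "retry_attempts": as_int("x-portkey-retry-attempt-count"),
    }


sample = {
    "X-Portkey-Last-Used-Option-Index": "1",
    "x-portkey-retry-attempt-count": "2",
}
print(describe_fallback(sample))
# {'target_index': 1, 'fell_back': True, 'retry_attempts': 2}
```

Logging this summary per request makes it easy to spot when the primary target is failing often enough that most traffic is served by a fallback.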
