Archestra acts as a security proxy between your AI applications and LLM providers, supporting a broad range of cloud APIs, enterprise-managed services, and self-hosted inference engines. Each provider has a dedicated proxy route with its own base URL and authentication method. For providers that require cloud IAM — Vertex AI and AWS Bedrock — Archestra integrates with workload identity so no API keys are needed at runtime.
The Model Router exposes a single OpenAI-compatible interface that can route to models across all configured providers. Use it to switch between providers without changing client code.
Use provider-qualified model IDs for deterministic routing: openai:gpt-5.4, anthropic:claude-opus-4-6-20250918, groq:llama-3.1-8b-instant, bedrock:amazon.nova-pro-v1:0. Call GET /v1/model-router/{llm-proxy-id}/models to list all available model IDs for your configured providers.
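Some model names contain colons themselves (for example bedrock:amazon.nova-pro-v1:0), so a client that needs to separate the provider from the model should split on the first colon only. The following is a minimal sketch of that idea. The helper names, the host https://archestra.example.com, and the proxy ID my-proxy are hypothetical and are not part of Archestra.

```python
from urllib.parse import quote


def split_model_id(qualified_id: str) -> tuple[str, str]:
    """Split 'provider:model' on the first colon only.

    The model part may itself contain colons, as in
    'bedrock:amazon.nova-pro-v1:0'.
    """
    provider, sep, model = qualified_id.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"not a provider-qualified model ID: {qualified_id!r}")
    return provider, model


def models_url(base_url: str, llm_proxy_id: str) -> str:
    """Build the URL for GET /v1/model-router/{llm-proxy-id}/models.

    base_url is an assumption: wherever your Archestra instance is reachable.
    """
    proxy_id = quote(llm_proxy_id, safe="")
    return f"{base_url.rstrip('/')}/v1/model-router/{proxy_id}/models"
```

A client could call models_url(...) once at startup, fetch the list, and validate configured model IDs with split_model_id before sending any traffic.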
Providers that use native request formats — Anthropic, Bedrock, Gemini, and Cohere — are translated between OpenAI and provider-native formats by the Model Router. Translation is text-first; non-text content parts such as image_url are dropped for Anthropic, Gemini, and Cohere routes (Bedrock supports base64 data URL images).
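To check in advance which parts of a request would survive translation, the rule above can be approximated on the client side. This is an illustrative sketch only, not Archestra's implementation. It assumes OpenAI-style content parts with a type field, assumes that Bedrock routes keep only images passed as base64 data URLs, and assumes that all other providers receive the parts unchanged.

```python
# Routes that, per the description above, drop non-text content parts.
TEXT_ONLY_PROVIDERS = {"anthropic", "gemini", "cohere"}


def is_base64_data_url(url: str) -> bool:
    """True for URLs of the form data:<mime>;base64,<payload>."""
    return url.startswith("data:") and ";base64," in url


def surviving_parts(provider: str, parts: list[dict]) -> list[dict]:
    """Return the content parts expected to survive text-first translation."""
    if provider in TEXT_ONLY_PROVIDERS:
        return [p for p in parts if p.get("type") == "text"]
    if provider == "bedrock":
        kept = []
        for p in parts:
            if p.get("type") == "text":
                kept.append(p)
            elif p.get("type") == "image_url":
                # Assumption: only base64 data URL images are kept.
                url = p.get("image_url", {}).get("url", "")
                if is_base64_data_url(url):
                    kept.append(p)
        return kept
    # Assumption: OpenAI-compatible providers need no translation.
    return list(parts)
```

Comparing len(parts) with len(surviving_parts(provider, parts)) before sending gives an early warning that an image would be silently dropped on the chosen route.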
OpenAI streaming responses require your cloud provider’s load balancer to support long-lived connections. See the Cloud Provider Configuration docs for streaming timeout settings.
Anthropic
Archestra proxies the Anthropic Messages API. Claude models on Microsoft Azure Foundry are also supported via a separate configuration.
Claude models deployed in Microsoft Foundry use the Anthropic Messages API at https://<resource>.services.ai.azure.com/anthropic. Set ARCHESTRA_ANTHROPIC_BASE_URL to that /anthropic base URL.
For keyless Microsoft Entra ID authentication, set ARCHESTRA_ANTHROPIC_AZURE_FOUNDRY_ENTRA_ID_ENABLED=true. Archestra will send a bearer token scoped to https://ai.azure.com/.default.
Claude Foundry deployments must exist in Azure before requests will work. Azure requires Anthropic deployment metadata (industry, organizationName, countryCode) when creating Claude deployments. Microsoft lists additional prerequisites: a paid eligible Azure subscription, a supported region (East US2, Sweden Central), Azure Marketplace access, and Contributor or Owner role on the resource group.
Google Gemini and Vertex AI
Archestra supports both Google AI Studio (Gemini Developer API) and Vertex AI implementations of the Gemini API.
For non-GKE environments, Vertex AI supports several ADC authentication methods:
Service account key file: Set ARCHESTRA_GEMINI_VERTEX_AI_CREDENTIALS_FILE to the path of a JSON key file
Local development: Run gcloud auth application-default login
Cloud environments: Compute Engine, Cloud Run, and Cloud Functions automatically detect attached service accounts
AWS/Azure: Use workload identity federation for keyless cross-cloud authentication
Azure AI Foundry
Azure AI Foundry (formerly Azure OpenAI) provides enterprise-grade access to OpenAI models through Microsoft Azure with both API key and keyless Entra ID authentication.
Set ARCHESTRA_AZURE_OPENAI_ENTRA_ID_ENABLED=true, then create an Azure provider key in Archestra with no API key value and set its Base URL to the Azure resource endpoint. Archestra uses DefaultAzureCredential — deployment URLs use the https://cognitiveservices.azure.com/.default scope; Foundry v1 URLs use https://ai.azure.com/.default.
Assign the managed identity the Cognitive Services OpenAI User role for Azure OpenAI deployment URLs, or Cognitive Services User for Foundry Models.
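The mapping from URL type to token scope described above can be sketched as a small function. How Archestra actually distinguishes the two URL types is not stated here. This sketch assumes that Foundry v1 base URLs end in /openai/v1 and treats every other URL as a deployment URL. The function name and the example resource names are hypothetical.

```python
from urllib.parse import urlparse

DEPLOYMENT_SCOPE = "https://cognitiveservices.azure.com/.default"
FOUNDRY_SCOPE = "https://ai.azure.com/.default"


def entra_scope_for(base_url: str) -> str:
    """Pick the Entra ID token scope for an Azure base URL.

    Assumption: a path ending in /openai/v1 marks a Foundry v1 URL;
    anything else is treated as an Azure OpenAI deployment URL.
    """
    path = urlparse(base_url).path.rstrip("/")
    if path.endswith("/openai/v1"):
        return FOUNDRY_SCOPE
    return DEPLOYMENT_SCOPE
```

The scope also tells you which role the identity needs: Cognitive Services OpenAI User for the deployment scope, Cognitive Services User for the Foundry scope.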
AWS Bedrock
Archestra supports the Bedrock Converse and Converse Stream APIs with both API key and AWS IAM authentication. IAM authentication uses the AWS credential chain (IRSA, instance profiles, environment variables) via SigV4 signing — no API key needed.
Archestra uses the Bedrock ListInferenceProfiles API to discover available models, so only models with inference profiles configured in your AWS account appear in the picker.
Filter models using environment variables:
# Only Anthropic and Amazon models
ARCHESTRA_BEDROCK_ALLOWED_PROVIDERS=anthropic,amazon
# Only US and global inference regions
ARCHESTRA_BEDROCK_ALLOWED_INFERENCE_REGIONS=us,global
Common provider prefixes: anthropic, amazon, meta, mistral, deepseek, cohere, writer.
Known region prefixes: us, eu, ap, global.
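To predict which inference profiles a given pair of filter values would let through, the filtering can be approximated as below. This is a sketch, not Archestra's implementation. It assumes profile IDs of the form <region>.<provider>.<model> (for example us.anthropic.claude-3-5-sonnet-20241022-v2:0) and assumes that an unset or empty variable means no filtering on that dimension.

```python
import os
from typing import Mapping


def parse_csv_env(name: str, environ: Mapping[str, str] = os.environ) -> set[str]:
    """Read a comma-separated variable into a set of lowercase values."""
    raw = environ.get(name, "")
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def is_allowed(profile_id: str, providers: set[str], regions: set[str]) -> bool:
    """Check an inference profile ID against the allowed prefixes.

    Assumption: IDs look like '<region>.<provider>.<model>'.
    An empty set disables that filter.
    """
    parts = profile_id.split(".", 2)
    if len(parts) < 3:
        return False
    region, provider, _model = parts
    if regions and region.lower() not in regions:
        return False
    if providers and provider.lower() not in providers:
        return False
    return True
```

With the example values above, us.anthropic.* and global.amazon.* profiles pass, while eu.* profiles and us.meta.* profiles are filtered out.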
Groq
Groq provides low-latency inference for popular open-source models through an OpenAI-compatible API.
xAI
xAI offers the Grok series of large language models with real-time information access and advanced reasoning capabilities via an OpenAI-compatible API.
OpenRouter
OpenRouter exposes :free model variants at no cost. An API key is still required. Use openrouter/free as the model ID to route to OpenRouter’s built-in free model picker, which selects a free model per request based on the features needed (tool calling, structured outputs, image input). When an OpenRouter key is added to an organization with no default model configured, Archestra sets the Free Models Router as the org default.
Get an API key from the OpenRouter dashboard.
Cerebras
Cerebras provides fast inference for open-source AI models through an OpenAI-compatible API.
DeepSeek
DeepSeek models are accessible through AWS Bedrock inference profiles. Use the deepseek: provider prefix with the Model Router to route requests to DeepSeek models configured in your AWS account.
Configure access by following the AWS Bedrock setup above and enabling the DeepSeek inference profiles in your AWS account. The ARCHESTRA_BEDROCK_ALLOWED_PROVIDERS variable can be set to include deepseek to surface only DeepSeek models in the picker.
MiniMax
MiniMax provides the MiniMax-M2 series with chain-of-thought reasoning capabilities and support for text and multi-turn conversations.
Perplexity
sonar-pro: Best for deep search-augmented generation
sonar: General-purpose search model
sonar-deep-research: Extended research tasks
Perplexity does not support external tool calling. It performs internal web searches and returns results in the response. Use Perplexity for search-augmented generation, not agentic workflows that require custom tools.
vLLM
ARCHESTRA_VLLM_BASE_URL: vLLM server base URL (e.g., http://localhost:8000/v1)
ARCHESTRA_CHAT_VLLM_API_KEY: Not required. API key for vLLM server (optional for most deployments)
The vLLM provider is only available when ARCHESTRA_VLLM_BASE_URL is set or a per-key base URL is configured in the UI. Per-key base URLs take precedence over the environment variable.
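The precedence rule can be expressed as a short resolution function. This is a sketch of the described behavior rather than Archestra's code. The function name is hypothetical, and returning None stands in for "the vLLM provider is unavailable".

```python
import os
from typing import Mapping, Optional


def resolve_vllm_base_url(
    per_key_base_url: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the effective vLLM base URL, or None if none is configured.

    A per-key base URL (set in the UI) wins over ARCHESTRA_VLLM_BASE_URL.
    """
    if per_key_base_url:
        return per_key_base_url
    # Treat an empty environment value the same as an unset one.
    return environ.get("ARCHESTRA_VLLM_BASE_URL") or None
```

For example, with ARCHESTRA_VLLM_BASE_URL=http://localhost:8000/v1 in the environment and a key configured with http://gpu-box:8000/v1, requests made with that key go to gpu-box.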
Ollama (Local)
Ollama is a local LLM runner for running open-source models on your own machine, ideal for local development, testing, and privacy-conscious deployments.
Go to Settings > LLM API Keys and add a new key with provider Ollama. Optionally set the Base URL if your Ollama server runs on a non-default host/port.
Ollama server base URL (default: http://localhost:11434/v1)
ARCHESTRA_CHAT_OLLAMA_API_KEY: Not required. API key for Ollama (optional, for Ollama Cloud)
Ollama is enabled by default with a base URL of http://localhost:11434/v1. Models must be pulled with ollama pull <model-name> before they can be used through Archestra.