## Overview

Skyvern requires a Large Language Model (LLM) provider to power its intelligent browser automation. This guide covers configuration for all supported LLM providers.

## Supported LLM Providers
| Provider | Supported Models | Best For |
|---|---|---|
| OpenAI | GPT-5, GPT-5.2, GPT-4.1, o3, o4-mini | Best performance, latest models |
| Anthropic | Claude 4 (Sonnet, Opus), Claude 4.5 (Haiku, Sonnet, Opus) | Strong reasoning, vision support |
| Azure OpenAI | Any GPT models (GPT-4o recommended) | Enterprise deployments |
| AWS Bedrock | Claude 3.5, 3.7, 4, 4.5 (Sonnet, Opus) | AWS-integrated environments |
| Gemini | Gemini 2.5 Pro/Flash, 3 Pro/Flash | Google ecosystem |
| Ollama | Any locally hosted model | Local/offline deployments |
| OpenRouter | Any available models | Multi-model flexibility |
| Groq | llama-3.1-8b-instant | Ultra-fast inference |
| OpenAI-Compatible | Custom endpoints via liteLLM | Self-hosted models |
## Quick Setup with CLI
The fastest way to configure LLMs is with the Skyvern CLI, which will:

- Prompt you to select an LLM provider
- Request the necessary API keys
- Generate the `.env` file with the correct configuration
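A minimal sketch of that flow, assuming the CLI is installed from PyPI; the exact subcommand name is an assumption, so check `skyvern --help`:

```bash
# Install Skyvern, then launch the interactive setup wizard,
# which writes the .env file described above
pip install skyvern
skyvern init
```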
## Provider Configuration
### OpenAI
**Environment Variables:**

- `OPENAI_GPT5` - Latest GPT-5 model
- `OPENAI_GPT5_2` - GPT-5.2 variant
- `OPENAI_GPT4_1` - GPT-4.1
- `OPENAI_O3` - o3 model
- `OPENAI_O4_MINI` - o4-mini (cost-effective)
- `OPENAI_GPT4O` - GPT-4o (multimodal)
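A sketch of an OpenAI `.env` entry. The `ENABLE_OPENAI`, `OPENAI_API_KEY`, and `LLM_KEY` names are assumptions based on the `ENABLE_<PROVIDER>` pattern used elsewhere in this guide; check the Environment Variables reference for the exact keys:

```bash
# Enable the OpenAI provider and select GPT-4o as the active model
ENABLE_OPENAI=true
OPENAI_API_KEY=<your-openai-api-key>
LLM_KEY=OPENAI_GPT4O
```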
### Anthropic
**Environment Variables:**

- `ANTHROPIC_CLAUDE4.5_OPUS` - Highest capability
- `ANTHROPIC_CLAUDE4.5_SONNET` - Balanced performance
- `ANTHROPIC_CLAUDE4_OPUS` - Claude 4 Opus
- `ANTHROPIC_CLAUDE4_SONNET` - Claude 4 Sonnet
- `ANTHROPIC_CLAUDE3.7_SONNET` - Claude 3.7
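A corresponding `.env` sketch; as with the other providers, the `ENABLE_ANTHROPIC`, `ANTHROPIC_API_KEY`, and `LLM_KEY` names are assumptions following the `ENABLE_<PROVIDER>` pattern:

```bash
# Enable the Anthropic provider and select Claude 4.5 Sonnet
ENABLE_ANTHROPIC=true
ANTHROPIC_API_KEY=<your-anthropic-api-key>
LLM_KEY=ANTHROPIC_CLAUDE4.5_SONNET
```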
### Azure OpenAI
**Setup Steps:**

1. Log in to the Azure Portal
2. Create an Azure Resource Group
3. Create an OpenAI resource in the Resource Group
4. Open the "Azure AI Foundry" portal
5. Navigate to "Shared Resources" > "Deployments"
6. Deploy a base model (e.g., GPT-4o)
7. Note the deployment name, API key, and endpoint
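A `.env` sketch using the values noted in the last step. Only `AZURE_DEPLOYMENT` appears elsewhere in this guide; the other variable names are assumptions, so verify them against the Environment Variables reference:

```bash
# Enable Azure OpenAI; AZURE_DEPLOYMENT is the deployment name noted above
ENABLE_AZURE=true
AZURE_DEPLOYMENT=<your-deployment-name>
AZURE_API_KEY=<your-api-key>
AZURE_API_BASE=https://<your-resource>.openai.azure.com/
AZURE_API_VERSION=<api-version>
```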
### AWS Bedrock
**Setup Steps:**

1. Create an AWS IAM user
2. Assign the "AmazonBedrockFullAccess" policy
3. Generate an Access Key and Secret Key
4. In the Amazon Bedrock console, go to "Model Access"
5. Enable "Claude 3.5 Sonnet v2" (or your desired model)
**Environment Variables:**

- `BEDROCK_ANTHROPIC_CLAUDE4.5_OPUS_INFERENCE_PROFILE`
- `BEDROCK_ANTHROPIC_CLAUDE4.5_SONNET_INFERENCE_PROFILE`
- `BEDROCK_ANTHROPIC_CLAUDE4_OPUS_INFERENCE_PROFILE`
- `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET` - v2 model
- `BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET_V1` - v1 model
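A `.env` sketch wiring in the IAM credentials from the setup steps; the variable names other than the LLM keys listed above are assumptions following the `ENABLE_<PROVIDER>` pattern and standard AWS credential naming:

```bash
# Enable Bedrock; authentication uses the IAM keys generated above
ENABLE_BEDROCK=true
AWS_REGION=<your-region>
AWS_ACCESS_KEY_ID=<access-key>
AWS_SECRET_ACCESS_KEY=<secret-key>
LLM_KEY=BEDROCK_ANTHROPIC_CLAUDE3.5_SONNET
```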
### Gemini
**Environment Variables:**

- `GEMINI_2.5_PRO` - Latest Pro model
- `GEMINI_2.5_FLASH` - Fast inference
- `GEMINI_2.5_PRO_PREVIEW` - Preview version
- `GEMINI_2.5_FLASH_PREVIEW` - Preview flash
- `GEMINI_3.0_FLASH` - Gemini 3.0
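A `.env` sketch; `ENABLE_GEMINI` and `GEMINI_API_KEY` are assumed names following the `ENABLE_<PROVIDER>` pattern:

```bash
# Enable Gemini and select the Pro model
ENABLE_GEMINI=true
GEMINI_API_KEY=<your-gemini-api-key>
LLM_KEY=GEMINI_2.5_PRO
```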
### Ollama (Local Models)
**Prerequisites:**

1. Install Ollama: https://ollama.ai
2. Pull a model: `ollama pull qwen2.5:7b-instruct`
3. Start the Ollama server (it listens on port 11434 by default)
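A `.env` sketch pointing Skyvern at the local server; `OLLAMA_SERVER_URL` and `OLLAMA_MODEL` are assumed names, so verify them against the Environment Variables reference:

```bash
# Point Skyvern at the local Ollama server and the pulled model
ENABLE_OLLAMA=true
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_MODEL=qwen2.5:7b-instruct
```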
### OpenRouter

**Environment Variables:** follow the `ENABLE_<PROVIDER>` pattern; see the Environment Variables reference under Next Steps for the exact OpenRouter keys.

### Groq

**Environment Variables:** see the Environment Variables reference under Next Steps for the exact Groq keys.

### Novita AI

**Environment Variables:** see the Environment Variables reference under Next Steps for the exact Novita AI keys.

### Volcengine (ByteDance Doubao)

**Environment Variables:** see the Environment Variables reference under Next Steps for the exact Volcengine keys.

### OpenAI-Compatible (Custom Endpoints)
For self-hosted models or custom endpoints that follow OpenAI's API format, point Skyvern at the endpoint's base URL.

**Example Base URLs:**

- Together AI: `https://api.together.xyz/v1`
- Local vLLM: `http://localhost:8000/v1`
- Local Ollama (via liteLLM): `http://localhost:11434/v1`
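A `.env` sketch for a local vLLM server; the `OPENAI_COMPATIBLE_*` names are assumptions, so check the Environment Variables reference for the exact keys:

```bash
# Example: a local vLLM server exposing an OpenAI-compatible API
ENABLE_OPENAI_COMPATIBLE=true
OPENAI_COMPATIBLE_MODEL_NAME=<model-name>
OPENAI_COMPATIBLE_API_BASE=http://localhost:8000/v1
OPENAI_COMPATIBLE_API_KEY=<key-if-required>
LLM_KEY=OPENAI_COMPATIBLE
```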
## Advanced Configuration
### Primary and Secondary LLM
Skyvern supports using a cheaper/faster secondary LLM for smaller tasks. If `SECONDARY_LLM_KEY` is empty, Skyvern uses the primary LLM for all tasks.
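A sketch of this split, using the LLM keys listed above (`LLM_KEY` as the primary selector is an assumption; `SECONDARY_LLM_KEY` is named in this guide):

```bash
# Strong primary model for complex reasoning, cheaper secondary for small tasks
LLM_KEY=OPENAI_GPT4O
SECONDARY_LLM_KEY=OPENAI_O4_MINI
```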
### Max Tokens Override
For OpenRouter and Ollama, you can override the maximum number of output tokens.

### Multiple Providers

You can enable multiple providers and switch between them.

## Verification
After configuration, verify your LLM setup:

1. Start Skyvern
2. Check the logs for LLM initialization
3. Test with a simple task in the UI
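Before starting Skyvern, a quick sanity check on the `.env` file can catch a missing provider toggle or LLM key (assuming the `ENABLE_<PROVIDER>` and `LLM_KEY` naming used in this guide):

```bash
# Print every provider toggle and LLM key currently set in .env
grep -E '^(ENABLE_|.*LLM_KEY)' .env
```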
## Troubleshooting
### "LLM provider not enabled"

Ensure `ENABLE_<PROVIDER>=true` is set in your `.env` file.
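For example, for OpenAI (substitute your provider's name in the `ENABLE_<PROVIDER>` pattern):

```bash
# Toggle the OpenAI provider on in .env
ENABLE_OPENAI=true
```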
### "Invalid API key"

Verify your API key:

- No extra spaces or quotes
- Key has the correct permissions
- Key is not expired
### "Model not found"

Check that:

- The model name matches the supported LLM keys exactly
- The provider is enabled
- For Azure: the deployment name matches `AZURE_DEPLOYMENT`
### High Costs

Optimize costs by:

- Using `SECONDARY_LLM_KEY` for simple tasks
- Choosing cost-effective models (GPT-4o-mini, Claude 3.5 Haiku)
- Using Ollama for local inference (free)
### Slow Performance

Improve speed with:

- Faster models (GPT-4o-mini, Gemini Flash)
- Groq for ultra-fast inference
- Local Ollama with GPU acceleration
## Recommended Configurations
### Development
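A development sketch favoring a cheap, fast model; key names follow the assumed `ENABLE_<PROVIDER>`/`LLM_KEY` convention used throughout this guide:

```bash
# Development: cost-effective model, no secondary LLM
ENABLE_OPENAI=true
OPENAI_API_KEY=<your-key>
LLM_KEY=OPENAI_O4_MINI
```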
### Production
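A production sketch pairing a strong primary model with a cheaper secondary, under the same assumed key names:

```bash
# Production: strong primary model plus a cheaper secondary for small tasks
ENABLE_OPENAI=true
OPENAI_API_KEY=<your-key>
LLM_KEY=OPENAI_GPT4O
SECONDARY_LLM_KEY=OPENAI_O4_MINI
```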
### Enterprise (Azure)
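An enterprise sketch built on the Azure OpenAI setup above; only `AZURE_DEPLOYMENT` appears elsewhere in this guide, and the other variable names are assumptions:

```bash
# Enterprise: Azure OpenAI deployment
ENABLE_AZURE=true
AZURE_DEPLOYMENT=<deployment-name>
AZURE_API_KEY=<api-key>
AZURE_API_BASE=https://<resource>.openai.azure.com/
AZURE_API_VERSION=<api-version>
```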
### Local/Offline
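A local/offline sketch reusing the Ollama setup above; `OLLAMA_SERVER_URL` and `OLLAMA_MODEL` are assumed names:

```bash
# Local/offline: Ollama only, no external API calls
ENABLE_OLLAMA=true
OLLAMA_SERVER_URL=http://localhost:11434
OLLAMA_MODEL=qwen2.5:7b-instruct
```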
## Next Steps
- **Environment Variables** - Complete configuration reference
- **Docker Setup** - Deploy with Docker Compose
- **Storage Configuration** - Configure S3/Azure storage