AWS Bedrock

Overview

AWS Bedrock provides access to multiple foundation models including Anthropic Claude, Meta Llama, Mistral AI, Amazon Nova, and more through a single API on AWS infrastructure.

Quick Start

Install LiteLLM

pip install litellm

Set AWS Credentials

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION_NAME="us-east-1"

Make Your First Call

from litellm import completion

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello from Bedrock!"}]
)
print(response.choices[0].message.content)

Supported Models

Anthropic Claude
Meta Llama
Amazon Nova
Mistral AI
AI21 Labs

Claude models via Bedrock:

# Claude 3.7 Sonnet
response = completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=[{"role": "user", "content": "Complex task..."}]
)

# Claude 3.5 Sonnet
response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Analyze this..."}]
)

# Claude 3.5 Haiku
response = completion(
    model="bedrock/anthropic.claude-3-5-haiku-20241022-v1:0",
    messages=[{"role": "user", "content": "Quick question..."}]
)

# Claude 3 Opus
response = completion(
    model="bedrock/anthropic.claude-3-opus-20240229-v1:0",
    messages=[{"role": "user", "content": "Deep reasoning..."}]
)

Llama models via Bedrock:

# Llama 3.3 70B
response = completion(
    model="bedrock/us.meta.llama3-3-70b-instruct-v1:0",
    messages=[{"role": "user", "content": "Write code..."}]
)

# Llama 3.2 90B Vision
response = completion(
    model="bedrock/us.meta.llama3-2-90b-instruct-v1:0",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": "..."}}
        ]
    }]
)

# Llama 3.2 11B Vision
response = completion(
    model="bedrock/us.meta.llama3-2-11b-instruct-v1:0",
    messages=[{"role": "user", "content": "Hello"}]
)

Amazon’s own models:

# Nova Pro
response = completion(
    model="bedrock/us.amazon.nova-pro-v1:0",
    messages=[{"role": "user", "content": "Complex analysis..."}]
)

# Nova Lite
response = completion(
    model="bedrock/us.amazon.nova-lite-v1:0",
    messages=[{"role": "user", "content": "Quick task..."}]
)

# Nova Micro
response = completion(
    model="bedrock/us.amazon.nova-micro-v1:0",
    messages=[{"role": "user", "content": "Simple query..."}]
)

Mistral models via Bedrock:

# Mistral Large
response = completion(
    model="bedrock/mistral.mistral-large-2407-v1:0",
    messages=[{"role": "user", "content": "Analyze data..."}]
)

# Mistral Small
response = completion(
    model="bedrock/mistral.mistral-small-2402-v1:0",
    messages=[{"role": "user", "content": "Quick task..."}]
)

AI21 Labs models via Bedrock:

# Jamba 1.5
response = completion(
    model="bedrock/ai21.jamba-1-5-large-v1:0",
    messages=[{"role": "user", "content": "Generate text..."}]
)

Authentication

Environment Variables
Direct Parameters
AWS Profile
IAM Role

Environment variables:

export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION_NAME="us-east-1"  # or us-west-2, eu-west-1, etc.

from litellm import completion

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello!"}]
)

Direct parameters:

from litellm import completion

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello!"}],
    aws_access_key_id="AKIA...",
    aws_secret_access_key="...",
    aws_region_name="us-east-1"
)

AWS profile:

# Use named AWS profile
export AWS_PROFILE="my-profile"

from litellm import completion

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello!"}],
    aws_region_name="us-east-1"
)

IAM role, when running on AWS (EC2, Lambda, ECS):

# No credentials needed - uses IAM role
from litellm import completion

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello!"}],
    aws_region_name="us-east-1"
)
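
The four methods differ only in whether explicit credential arguments are passed. A minimal sketch of a helper that forwards explicit values when they are present and otherwise passes nothing, so the environment variables, profile, or IAM role are used instead. The function name `bedrock_auth_kwargs` is hypothetical and not part of LiteLLM; only the keyword names `aws_access_key_id`, `aws_secret_access_key`, and `aws_region_name` come from the examples above.

```python
from typing import Dict, Optional


def bedrock_auth_kwargs(
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
    region_name: Optional[str] = None,
) -> Dict[str, str]:
    """Build keyword arguments for completion(), omitting unset values.

    Hypothetical helper: it only assembles a dict and never calls AWS.
    """
    # A key ID without its secret (or the reverse) is always a mistake.
    if (access_key_id is None) != (secret_access_key is None):
        raise ValueError("Provide both access key ID and secret access key, or neither")

    candidates = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "aws_region_name": region_name,
    }
    # Drop unset entries so the default credential sources apply.
    return {name: value for name, value in candidates.items() if value is not None}


# On an IAM role only the region is passed through.
print(bedrock_auth_kwargs(region_name="us-east-1"))
```

The result can be unpacked into the call, for example `completion(model=..., messages=..., **bedrock_auth_kwargs(region_name="us-east-1"))`.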

Available Regions

Bedrock is available in multiple AWS regions:

Region	Code	Models
US East (N. Virginia)	`us-east-1`	All models
US West (Oregon)	`us-west-2`	All models
Europe (Frankfurt)	`eu-central-1`	Most models
Europe (Ireland)	`eu-west-1`	Most models
Asia Pacific (Singapore)	`ap-southeast-1`	Most models
Asia Pacific (Tokyo)	`ap-northeast-1`	Most models

# Use specific region
response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello!"}],
    aws_region_name="eu-west-1"
)
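
Because model availability differs by region, it can help to try a preferred region first and fall back to another. The following is a sketch under assumptions, not a LiteLLM feature: `complete_with_region_fallback` is a hypothetical wrapper, and `call` stands in for `litellm.completion` (any callable that accepts `aws_region_name`).

```python
from typing import Any, Callable, Optional, Sequence


def complete_with_region_fallback(
    call: Callable[..., Any],
    regions: Sequence[str],
    **kwargs: Any,
) -> Any:
    """Try each region in order and return the first successful response."""
    if not regions:
        raise ValueError("regions must not be empty")

    last_error: Optional[Exception] = None
    for region in regions:
        try:
            return call(aws_region_name=region, **kwargs)
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
    # Every region failed: surface the most recent error.
    assert last_error is not None
    raise last_error
```

With LiteLLM this might be used as `complete_with_region_fallback(completion, ["eu-west-1", "us-east-1"], model=..., messages=...)`. In real code, catching a narrower exception type than `Exception` is usually preferable.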

Streaming

from litellm import completion

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
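
When the full text is needed after streaming finishes, the chunks can be accumulated instead of printed. A minimal sketch, assuming each chunk has the `choices[0].delta.content` shape used above; `collect_stream` is a hypothetical helper, and the demo chunks are stand-ins built with `SimpleNamespace` rather than real Bedrock output.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def collect_stream(chunks: Iterable[Any]) -> str:
    """Join the text deltas of a streamed response into one string."""
    parts: List[str] = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        text = chunk.choices[0].delta.content
        if text:  # skip None and empty deltas
            parts.append(text)
    return "".join(parts)


def _fake_chunk(text):
    """Stand-in for a streamed chunk, for demonstration only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_chunks = [_fake_chunk("Once "), _fake_chunk(None), _fake_chunk("upon a time")]
print(collect_stream(demo_chunks))  # Once upon a time
```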

Function Calling

Use tools with Claude on Bedrock:

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }
}]

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools
)

if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
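
The model returns only the function name and a JSON string of arguments; running the function is up to the caller. A sketch of that step, assuming the `tool_call` shape shown above. `TOOL_REGISTRY`, `run_tool_call`, and the local `get_weather` implementation are hypothetical, and the returned dict follows the OpenAI-style tool message format.

```python
import json
from types import SimpleNamespace
from typing import Any, Callable, Dict


def get_weather(location: str) -> str:
    """Hypothetical local implementation of the tool."""
    return f"Sunny in {location}"


# Maps tool names advertised to the model onto local callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_call(tool_call: Any) -> Dict[str, str]:
    """Execute one tool call and build the follow-up tool message."""
    name = tool_call.function.name
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    # Arguments arrive as a JSON string, not a dict.
    arguments = json.loads(tool_call.function.arguments or "{}")
    result = TOOL_REGISTRY[name](**arguments)
    return {"role": "tool", "tool_call_id": tool_call.id, "content": str(result)}


# Stand-in tool call for demonstration; a real one comes from the response.
demo_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"location": "Boston"}'),
)
print(run_tool_call(demo_call))
```

The returned message would be appended to `messages`, after the assistant message that requested the tool, before calling `completion` again.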

Vision (Multimodal)

Use vision models like Claude or Llama 3.2 Vision:

# Claude with vision
response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.jpg"}
            }
        ]
    }]
)

# Llama 3.2 Vision
response = completion(
    model="bedrock/us.meta.llama3-2-90b-instruct-v1:0",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this"},
            {"type": "image_url", "image_url": {"url": "..."}}
        ]
    }]
)
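
For local images, a common pattern is to embed the bytes as a base64 data URL in the same `image_url` field. This sketch assumes the data URL form is accepted for the chosen model; `image_part_from_bytes` is a hypothetical helper.

```python
import base64
from typing import Any, Dict


def image_part_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> Dict[str, Any]:
    """Wrap raw image bytes as an image_url content part using a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Placeholder bytes for demonstration; real use would read an image file.
part = image_part_from_bytes(b"abc", mime_type="image/png")
print(part["image_url"]["url"])  # data:image/png;base64,YWJj
```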

Embeddings

Generate embeddings using Bedrock:

from litellm import embedding

# Amazon Titan Embeddings
response = embedding(
    model="bedrock/amazon.titan-embed-text-v1",
    input="Hello world"
)
print(len(response.data[0].embedding))  # 1536 dimensions

# Titan Embeddings V2
response = embedding(
    model="bedrock/amazon.titan-embed-text-v2:0",
    input="Hello world"
)

# Cohere Embeddings
response = embedding(
    model="bedrock/cohere.embed-english-v3",
    input=["Text 1", "Text 2"]
)
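
Embedding vectors are typically compared with cosine similarity. A small, dependency-free sketch; the vectors below are made-up placeholders, and in practice they would come from `response.data[i].embedding` as in the first example above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors, not real embeddings.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```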

Reranking

Rerank documents using Cohere on Bedrock:

from litellm import rerank

response = rerank(
    model="bedrock/cohere.rerank-v3-5:0",
    query="What is machine learning?",
    documents=[
        "Machine learning is a subset of AI",
        "Python is a programming language",
        "Deep learning uses neural networks"
    ]
)

for result in response.results:
    # Each result is read by key here, not by attribute
    print(f"Index: {result['index']}, Score: {result['relevance_score']}")
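
Rerank results refer to documents by position, so they need to be mapped back to the original list. A sketch assuming each result can be read by key with `index` and `relevance_score` entries; `top_documents` is a hypothetical helper and the scores are invented for illustration.

```python
from typing import Any, List, Mapping, Sequence, Tuple


def top_documents(
    results: Sequence[Mapping[str, Any]],
    documents: Sequence[str],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring documents with their scores."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked[:k]]


documents = [
    "Machine learning is a subset of AI",
    "Python is a programming language",
    "Deep learning uses neural networks",
]
# Invented scores standing in for a real rerank response.
sample_results = [
    {"index": 1, "relevance_score": 0.05},
    {"index": 0, "relevance_score": 0.92},
    {"index": 2, "relevance_score": 0.61},
]
for text, score in top_documents(sample_results, documents):
    print(f"{score:.2f}  {text}")
```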

Batch Processing

Submit a batch job that reads its requests from a file in S3 and runs asynchronously:

from litellm import create_batch, retrieve_batch

batch = create_batch(
    custom_llm_provider="bedrock",
    input_file_id="s3://bucket/input.jsonl",
    endpoint="/invoke",
    completion_window="24h"
)

print(f"Batch ID: {batch.id}")

# Check status
batch_status = retrieve_batch(
    custom_llm_provider="bedrock",
    batch_id=batch.id
)
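
A batch job finishes on its own schedule, so the status check is usually repeated until the job reaches a final state. A polling sketch under assumptions: the status names follow the OpenAI-style batch object and may differ for Bedrock, `wait_for_batch` is a hypothetical helper, and `fetch` stands in for a call such as `retrieve_batch`.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

# Assumed terminal states, borrowed from the OpenAI-style batch object.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(
    fetch: Callable[[str], Any],
    batch_id: str,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll fetch(batch_id) until the batch reaches a terminal status."""
    for _ in range(max_polls):
        batch = fetch(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        sleep(poll_seconds)
    raise TimeoutError(f"Batch {batch_id} did not finish after {max_polls} polls")


# Demonstration with a fake fetcher that completes on the third poll.
_statuses = iter(["validating", "in_progress", "completed"])
demo_batch = wait_for_batch(
    lambda batch_id: SimpleNamespace(id=batch_id, status=next(_statuses)),
    "batch-123",
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(demo_batch.status)  # completed
```

With LiteLLM, `fetch` could be `lambda batch_id: retrieve_batch(custom_llm_provider="bedrock", batch_id=batch_id)`.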

Converse API vs Invoke API

Bedrock supports two APIs:

Converse API (Recommended)
Invoke API (Legacy)

Converse API, a unified API across all models:

# Automatically uses Converse API by default
response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello"}]
)

Benefits:

Consistent interface across models
Better support for multi-turn conversations
Supports all model features

Invoke API, a model-specific API:

# Force the Invoke API with the bedrock/invoke/ route prefix
response = completion(
    model="bedrock/invoke/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello"}]
)

Cross-Region Inference

Use cross-region inference profiles:

# Cross-region profile
response = completion(
    model="bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello"}],
    aws_region_name="us-east-1"
)
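
A cross-region inference profile ID is the base model ID with a geography prefix, as in the `us.` IDs used throughout this page. A sketch of building one; `inference_profile_id` is a hypothetical helper, the prefix set is an assumption that may not cover every geography, and not every model has a profile in every geography.

```python
# Assumed geography prefixes; check the Bedrock console for the current list.
GEO_PREFIXES = {"us", "eu", "apac"}


def inference_profile_id(model_id: str, geo: str) -> str:
    """Prefix a base model ID with a geography to form a profile ID."""
    if geo not in GEO_PREFIXES:
        raise ValueError(f"Unknown geography prefix: {geo}")
    # Leave IDs that already carry a geography prefix untouched.
    if model_id.split(".", 1)[0] in GEO_PREFIXES:
        return model_id
    return f"{geo}.{model_id}"


print(inference_profile_id("anthropic.claude-3-5-sonnet-20240620-v1:0", "us"))
# us.anthropic.claude-3-5-sonnet-20240620-v1:0
```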

Guardrails

Apply AWS Bedrock Guardrails:

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello"}],
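    # Guardrail settings are passed through as guardrailConfig
    guardrailConfig={
        "guardrailIdentifier": "your-guardrail-id",
        "guardrailVersion": "1",
        "trace": "disabled"
    }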
)

Advanced Parameters

Temperature and Sampling

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Be creative"}],
    temperature=0.9,
    top_p=0.95,
    max_tokens=1000
)

System Messages

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)

Error Handling

from litellm import completion
from litellm.exceptions import (
    AuthenticationError,
    RateLimitError,
    APIError
)

try:
    response = completion(
        model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except AuthenticationError:
    print("AWS credentials invalid")
except RateLimitError:
    print("Bedrock throttling limit hit")
except APIError as e:
    print(f"Bedrock error: {e}")
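
Throttling errors are usually transient, so retrying with exponential backoff is a common response. A generic sketch that does not depend on LiteLLM: `call_with_backoff` is a hypothetical helper, `retry_on` is whichever exception type should trigger a retry (for example `RateLimitError`), and the delays are illustrative defaults.

```python
import random
import time
from typing import Any, Callable, Tuple, Type, Union

ExceptionTypes = Union[Type[BaseException], Tuple[Type[BaseException], ...]]


def call_with_backoff(
    call: Callable[[], Any],
    retry_on: ExceptionTypes,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run call(), retrying on retry_on with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))
```

With LiteLLM this might look like `call_with_backoff(lambda: completion(model=..., messages=...), retry_on=RateLimitError)`.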

Cost Tracking

from litellm import completion, completion_cost

response = completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello!"}]
)

cost = completion_cost(completion_response=response)
print(f"Cost: ${cost:.6f}")

print(f"Input tokens: {response.usage.prompt_tokens}")
print(f"Output tokens: {response.usage.completion_tokens}")
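
To track spend across many calls, the per-response values can be summed in a small accumulator. A sketch only: `UsageTotals` is a hypothetical class, and the numbers in the demo are invented rather than real Bedrock prices.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running totals of token usage and cost across several calls."""

    prompt_tokens: int = 0
    completion_tokens: int = 0
    cost_usd: float = 0.0

    def add(self, prompt_tokens: int, completion_tokens: int, cost_usd: float) -> None:
        """Record the usage and cost of one response."""
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        self.cost_usd += cost_usd


totals = UsageTotals()
# Invented figures for demonstration only.
totals.add(prompt_tokens=10, completion_tokens=5, cost_usd=0.001)
totals.add(prompt_tokens=20, completion_tokens=15, cost_usd=0.002)
print(f"{totals.prompt_tokens} in, {totals.completion_tokens} out, ${totals.cost_usd:.6f}")
```

After each call, the values would come from the response, for example `totals.add(response.usage.prompt_tokens, response.usage.completion_tokens, completion_cost(completion_response=response))`.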

Model Access

Before using models, ensure they’re enabled in your AWS account:

Go to AWS Bedrock console
Navigate to “Model access”
Request access for desired models
Wait for approval (usually instant for most models)

Best Practices

Use IAM Roles

When on AWS, use IAM roles instead of access keys for better security.

Choose Right Region

Select a region close to your users for lower latency.

Enable Model Access

Request model access in Bedrock console before use.

Use Converse API

Prefer Converse API for better compatibility across models.

Related Documentation

Anthropic

Learn about Claude-specific features

Function Calling

Implement tool use on Bedrock

Embeddings

Generate embeddings on Bedrock

Streaming

Stream responses in real-time
