OpenRouter

OpenRouter provides a unified API for accessing hundreds of AI models from multiple providers including OpenAI, Anthropic, Google, Meta, and more. The langchain-openrouter integration allows you to use these models in your LangChain applications.

Installation

Install the langchain-openrouter package:

pip install langchain-openrouter

Setup

Get an API key from OpenRouter and set it as an environment variable:

export OPENROUTER_API_KEY="your-api-key"
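
It can help to fail fast when the key is missing, rather than discovering it on the first request. The following is a minimal sketch using only the standard library; `require_openrouter_key` is a hypothetical helper name, not part of langchain-openrouter.

```python
import os
from typing import Mapping, Optional


def require_openrouter_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenRouter API key, or raise a clear error if it is unset.

    Hypothetical helper for illustration; not a library function.
    """
    # Default to the real process environment; tests can pass a plain dict.
    source = os.environ if env is None else env
    key = source.get("OPENROUTER_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENROUTER_API_KEY is not set; export it before creating a model."
        )
    return key
```

Calling `require_openrouter_key()` at application startup surfaces a missing key before any model is constructed.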

Basic Usage

from langchain_openrouter import ChatOpenRouter

model = ChatOpenRouter(
    model="anthropic/claude-sonnet-4-5",
    temperature=0
)

response = model.invoke("What is the capital of France?")
print(response.content)

Available Models

OpenRouter provides access to many models. Popular options include:

openai/gpt-4o - OpenAI GPT-4o
openai/gpt-4o-mini - OpenAI GPT-4o Mini (cost-effective)
anthropic/claude-sonnet-4-5 - Anthropic Claude Sonnet 4.5
anthropic/claude-opus-4 - Anthropic Claude Opus 4
google/gemini-2.0-flash-exp - Google Gemini 2.0 Flash
meta-llama/llama-3.3-70b-instruct - Meta Llama 3.3 70B
openai/o3-mini - OpenAI o3-mini (reasoning model)

For a complete list, visit OpenRouter Models.
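
The identifiers listed above follow a `provider/model` pattern. As an illustration, the sketch below splits an identifier into its two parts, which can be handy for logging or grouping by provider. `ModelId` and `parse_model_id` are hypothetical names introduced here, not part of the integration.

```python
from typing import NamedTuple


class ModelId(NamedTuple):
    provider: str  # e.g. "openai"
    name: str      # e.g. "gpt-4o-mini"


def parse_model_id(model_id: str) -> ModelId:
    """Split 'provider/model' at the first slash (hypothetical helper)."""
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"Expected 'provider/model', got {model_id!r}")
    return ModelId(provider, name)


parsed = parse_model_id("meta-llama/llama-3.3-70b-instruct")
print(parsed.provider, parsed.name)
```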

Configuration

Model Parameters

from langchain_openrouter import ChatOpenRouter

model = ChatOpenRouter(
    model="openai/gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    seed=42  # For reproducibility
)

Provider Preferences

Control which provider handles your request:

model = ChatOpenRouter(
    model="anthropic/claude-sonnet-4-5",
    openrouter_provider={"order": ["Anthropic", "AWS"]}
)

App Attribution

Set your app information for OpenRouter attribution:

model = ChatOpenRouter(
    model="openai/gpt-4o-mini",
    app_url="https://myapp.com",
    app_title="My App"
)

Retry Configuration

model = ChatOpenRouter(
    model="openai/gpt-4o-mini",
    max_retries=3,  # Number of retry attempts
    timeout=30000   # Timeout in milliseconds
)

Streaming

from langchain_openrouter import ChatOpenRouter

model = ChatOpenRouter(
    model="openai/gpt-4o-mini",
    streaming=True
)

for chunk in model.stream("Tell me a story"):
    print(chunk.content, end="", flush=True)
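
When the full text is needed after streaming finishes, the chunks can be collected while they are printed. This is a sketch under the assumption that each chunk exposes a string `content` attribute, as in the loop above; `collect_stream` is a hypothetical helper, and the stand-in chunks exist only so the example runs without network access.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each chunk's text as it arrives and return the joined text."""
    parts = []
    for chunk in chunks:
        text = chunk.content
        # Some models may return non-string content; this sketch skips it.
        if isinstance(text, str):
            print(text, end="", flush=True)
            parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks that mimic the shape of streamed messages.
fake_chunks = [SimpleNamespace(content=t) for t in ("Once ", "upon ", "a time")]
story = collect_stream(fake_chunks)
```

With a real model, the same helper would be called as `collect_stream(model.stream("Tell me a story"))`.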

Tool Calling

OpenRouter supports tool calling with compatible models:

from langchain_openrouter import ChatOpenRouter
from pydantic import BaseModel, Field

class GetWeather(BaseModel):
    """Get the current weather in a given location."""
    location: str = Field(description="The city and state, e.g. San Francisco, CA")

model = ChatOpenRouter(model="openai/gpt-4o-mini")
model_with_tools = model.bind_tools([GetWeather])

response = model_with_tools.invoke("What's the weather in San Francisco?")
print(response.tool_calls)
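
The model only requests a tool call; running the tool is up to your code. The sketch below dispatches entries shaped like `response.tool_calls` (dicts with `name`, `args`, and `id`) to local functions. `get_weather`, `TOOL_REGISTRY`, and `run_tool_calls` are hypothetical names, and the weather result is hard-coded for illustration.

```python
from typing import Any, Callable, Dict, List


def get_weather(location: str) -> str:
    """Hypothetical local implementation backing the GetWeather schema."""
    return f"Sunny in {location}"


# Maps tool names, as reported by the model, to local callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"GetWeather": get_weather}


def run_tool_calls(tool_calls: List[dict]) -> List[dict]:
    """Execute each requested tool and pair its output with the call id."""
    results = []
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call["name"])
        if func is None:
            output = f"Unknown tool: {call['name']}"
        else:
            output = func(**call["args"])
        results.append({"tool_call_id": call["id"], "content": str(output)})
    return results


# Sample data in the shape of response.tool_calls.
sample_calls = [
    {"name": "GetWeather", "args": {"location": "San Francisco, CA"}, "id": "call_1"}
]
print(run_tool_calls(sample_calls))
```

Each result carries the originating call id, so it can be turned into a tool message and sent back to the model in a follow-up turn.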

Structured Output

Generate structured outputs using function calling or JSON schema:

Using Function Calling

from langchain_openrouter import ChatOpenRouter
from pydantic import BaseModel, Field

class Joke(BaseModel):
    """A joke with setup and punchline."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")

model = ChatOpenRouter(model="openai/gpt-4o-mini")
structured_model = model.with_structured_output(Joke)

result = structured_model.invoke("Tell me a joke about programming")
print(f"Setup: {result.setup}")
print(f"Punchline: {result.punchline}")

Using JSON Schema

model = ChatOpenRouter(model="openai/gpt-4o-mini")
structured_model = model.with_structured_output(
    Joke,
    method="json_schema"
)

result = structured_model.invoke("Tell me a joke")

Reasoning Models

OpenRouter supports reasoning models with configurable reasoning effort:

from langchain_openrouter import ChatOpenRouter

model = ChatOpenRouter(
    model="openai/o3-mini",
    reasoning={
        "effort": "high",  # Options: xhigh, high, medium, low, minimal, none
        "summary": "auto"  # Options: auto, concise, detailed
    }
)

response = model.invoke("Solve this math problem: What is 123 * 456?")
print(response.content)

# Access reasoning content
if "reasoning_content" in response.additional_kwargs:
    print(f"Reasoning: {response.additional_kwargs['reasoning_content']}")

Multi-modal Input

Some models support images and other media:

from langchain_openrouter import ChatOpenRouter
from langchain_core.messages import HumanMessage

model = ChatOpenRouter(model="openai/gpt-4o")

message = HumanMessage(
    content=[
        {"type": "text", "text": "What's in this image?"},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/image.jpg"}
        }
    ]
)

response = model.invoke([message])
print(response.content)
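
For local files, a common approach is to embed the image as a base64 data URL in the same `image_url` block. The sketch below builds such a block with the standard library; the helper names are hypothetical, and whether a given model accepts data URLs is an assumption to verify against that model's documentation.

```python
import base64
import mimetypes
from pathlib import Path


def image_block_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build an image_url content block holding a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def image_block_from_file(path: str) -> dict:
    """Read a local file and guess its MIME type from the file extension."""
    mime_type, _ = mimetypes.guess_type(path)
    return image_block_from_bytes(
        Path(path).read_bytes(), mime_type or "application/octet-stream"
    )
```

The returned dict can take the place of the URL-based block in the `content` list of a `HumanMessage`.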

Cost Tracking

OpenRouter provides cost information in response metadata:

from langchain_openrouter import ChatOpenRouter

model = ChatOpenRouter(model="openai/gpt-4o-mini")
response = model.invoke("Hello!")

# Access cost data
if "cost" in response.response_metadata:
    print(f"Cost: ${response.response_metadata['cost']}")

# Access token usage
if response.usage_metadata:
    print(f"Input tokens: {response.usage_metadata['input_tokens']}")
    print(f"Output tokens: {response.usage_metadata['output_tokens']}")
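
To track spend across several calls, the per-response values can be summed. A minimal sketch, assuming responses expose `response_metadata` and `usage_metadata` as in the example above; `UsageTotals` is a hypothetical class, and the stand-in responses use made-up numbers so the example runs offline.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTotals:
    """Running totals of cost and token counts (hypothetical helper)."""
    cost: float = 0.0
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, response) -> None:
        # Either attribute may be missing or None, so fall back to empty dicts.
        metadata = getattr(response, "response_metadata", None) or {}
        usage = getattr(response, "usage_metadata", None) or {}
        self.cost += float(metadata.get("cost") or 0.0)
        self.input_tokens += int(usage.get("input_tokens") or 0)
        self.output_tokens += int(usage.get("output_tokens") or 0)


# Stand-in responses with made-up values.
fake_responses = [
    SimpleNamespace(
        response_metadata={"cost": 0.25},
        usage_metadata={"input_tokens": 10, "output_tokens": 5},
    ),
    SimpleNamespace(response_metadata={}, usage_metadata=None),
]

totals = UsageTotals()
for fake in fake_responses:
    totals.add(fake)
print(totals)
```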

Using with Agents

from datetime import datetime

from langchain_openrouter import ChatOpenRouter
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

@tool
def get_current_time() -> str:
    """Get the current time."""
    return datetime.now().strftime("%H:%M:%S")

model = ChatOpenRouter(model="openai/gpt-4o-mini")
tools = [get_current_time]

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}")
])

agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

result = agent_executor.invoke({"input": "What time is it?"})
print(result["output"])

Using in RAG Applications

from langchain_openrouter import ChatOpenRouter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Set up vector store (example)
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(
    ["Paris is the capital of France.", "London is the capital of England."],
    embedding=embeddings
)
retriever = vectorstore.as_retriever()

# Create RAG chain
model = ChatOpenRouter(model="anthropic/claude-sonnet-4-5")

prompt = ChatPromptTemplate.from_template(
    """Answer based on the context:

{context}

Question: {question}"""
)

def format_docs(docs):
    return "\n\n".join([doc.page_content for doc in docs])

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

answer = chain.invoke("What is the capital of France?")
print(answer)

Advanced Configuration

Custom Base URL

model = ChatOpenRouter(
    model="openai/gpt-4o-mini",
    base_url="https://custom-proxy.example.com"
)

Route Preferences

model = ChatOpenRouter(
    model="openai/gpt-4o-mini",
    route="fallback"  # Use fallback routing
)

Multiple Completions

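from langchain_core.messages import HumanMessage

model = ChatOpenRouter(
    model="openai/gpt-4o-mini",
    n=3  # Generate 3 completions
)

# invoke() returns a single message, so use generate() to see every completion
result = model.generate([[HumanMessage(content="Tell me a joke")]])
for generation in result.generations[0]:
    print(generation.text)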

Error Handling

from langchain_openrouter import ChatOpenRouter

model = ChatOpenRouter(model="openai/gpt-4o-mini")

try:
    response = model.invoke("Hello!")
    print(response.content)
except ValueError as e:
    print(f"API error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
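
Beyond catching exceptions, an application may want to try an alternative when a call fails. The sketch below is a generic, library-independent pattern: it tries a list of callables in order and returns the first successful result. `invoke_with_fallback`, `failing`, and `working` are hypothetical stand-ins; in practice each callable might wrap a differently configured model.

```python
from typing import Callable, Sequence


def invoke_with_fallback(
    prompt: str, invokers: Sequence[Callable[[str], str]]
) -> str:
    """Return the first successful result, trying each callable in order."""
    errors = []
    for invoke in invokers:
        try:
            return invoke(prompt)
        except Exception as exc:  # broad on purpose: any failure moves on
            errors.append(exc)
    raise RuntimeError(f"All {len(errors)} attempts failed: {errors}")


def failing(prompt: str) -> str:
    """Stand-in for a call that raises an API error."""
    raise ValueError("simulated API error")


def working(prompt: str) -> str:
    """Stand-in for a call that succeeds."""
    return f"echo: {prompt}"


print(invoke_with_fallback("Hello!", [failing, working]))
```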

API Reference

For detailed API documentation, see:

ChatOpenRouter API Reference
