Use Portkey as the LLM and embedding backend in LlamaIndex pipelines and agents.
LlamaIndex is a data framework for building LLM applications over your own data, including RAG pipelines, query engines, and multi-agent systems. By setting api_base to the gateway URL, all LlamaIndex LLM calls route through Portkey.
Set api_base on the OpenAI LLM class to point at the gateway.
```python
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_base="http://localhost:8787/v1",
    api_key="sk-***",
)

response = llm.complete("What is the Portkey AI Gateway?")
print(response.text)
```
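Embeddings can be routed the same way. A minimal configuration sketch, assuming the `OpenAIEmbedding` class accepts the same `api_base`/`api_key` overrides as the LLM class (model name and key shown are placeholders):

```python
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

# Point the embedding model at the gateway, same as the LLM
embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_base="http://localhost:8787/v1",
    api_key="sk-***",
)

# Make it the default for indexes and query engines built afterwards
Settings.embed_model = embed_model
```

With this set, indexing and retrieval calls (e.g. `VectorStoreIndex.from_documents(...)`) also go through the gateway.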
Real-world use case: multi-agent system with multiple LLMs
Route different agents to different LLMs through the gateway, with full observability over every request.
```python
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

# Orchestrator using GPT-4o
gpt_4o_config = {
    "provider": "openai",
    "api_key": "sk-***",
    "override_params": {"model": "gpt-4o"},
}
gpt_4o = OpenAI(
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        config=gpt_4o_config,
    ),
)

# Worker using a faster model
llama3_config = {
    "provider": "groq",
    "api_key": "gsk-***",
    "override_params": {"model": "llama3-70b-8192"},
}
llama3 = OpenAI(
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="YOUR_PORTKEY_API_KEY",
        config=llama3_config,
    ),
)

def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "A baby llama is called a 'Cria'."

tool = FunctionTool.from_defaults(fn=get_the_secret_fact)

# Worker agent on the fast model, orchestrator agent on GPT-4o
agent1 = ReActAgent.from_tools([tool], llm=llama3)
agent2 = ReActAgent.from_tools([], llm=gpt_4o)
```
When using the hosted gateway, all LlamaIndex requests appear in the Portkey observability dashboard with token counts, latency, and cost — without any additional instrumentation.