OpenAI is the default and recommended LLM provider for Graphiti, offering state-of-the-art language models with structured output support.
Installation
OpenAI support is included in the base installation:
pip install graphiti-core
Configuration
Environment Variables
Graphiti automatically detects and uses the OPENAI_API_KEY environment variable.
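A missing key is easier to diagnose when it is caught before the first LLM call. The helper below is a minimal sketch of such a check. `require_openai_api_key` is a hypothetical name and not part of Graphiti; only the OPENAI_API_KEY variable name comes from this page.

```python
import os
from typing import Mapping, Optional


def require_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of graphiti_core.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating a Graphiti instance."
        )
    return key
```

Calling `require_openai_api_key()` at startup fails fast with a readable message instead of an authentication error later on.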
Basic Setup
By default, Graphiti uses OpenAI with no additional configuration:
from graphiti_core import Graphiti
# Uses OpenAI by default
graphiti = Graphiti(
    "bolt://localhost:7687",
    "neo4j",
    "password"
)
Custom LLM Configuration
Customize the OpenAI LLM client:
from graphiti_core import Graphiti
from graphiti_core.llm_client.openai_client import OpenAIClient
from graphiti_core.llm_client.config import LLMConfig
# Create custom LLM client
llm_client = OpenAIClient(
    config=LLMConfig(
        api_key="sk-...",
        model="gpt-4.1-mini",
        small_model="gpt-4.1-mini",
        temperature=0.7,
        max_tokens=8192
    )
)
# Initialize Graphiti with custom client
graphiti = Graphiti(
    "bolt://localhost:7687",
    "neo4j",
    "password",
    llm_client=llm_client
)
Embeddings Configuration
Customize the OpenAI embeddings:
from graphiti_core import Graphiti
from graphiti_core.embedder.openai import OpenAIEmbedder, OpenAIEmbedderConfig
# Create custom embedder
embedder = OpenAIEmbedder(
    config=OpenAIEmbedderConfig(
        api_key="sk-...",
        embedding_model="text-embedding-3-small",
        embedding_dim=1536
    )
)
# Initialize Graphiti with custom embedder
graphiti = Graphiti(
    "bolt://localhost:7687",
    "neo4j",
    "password",
    embedder=embedder
)
Supported Models
Language Models
- gpt-4.1-mini (recommended): Latest mini model with great performance
- gpt-4.1: Full GPT-4.1 model for complex tasks
- gpt-5-mini: Reasoning model with extended thinking
- gpt-5: Advanced reasoning model
- o1, o3: Specialized reasoning models
Embedding Models
- text-embedding-3-small (default): 1536 dimensions, cost-effective
- text-embedding-3-large: 3072 dimensions, highest quality
- text-embedding-ada-002: Legacy model, 1536 dimensions
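When switching embedding models, `embedding_dim` usually needs to change as well. The lookup below is a small sketch that uses the dimensions listed above. It assumes you want each model's full native dimension, and `default_embedding_dim` is a hypothetical helper, not a Graphiti function.

```python
# Native output dimensions, as listed in the embedding model list above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def default_embedding_dim(model: str) -> int:
    """Hypothetical helper: return the listed dimension for a known model."""
    try:
        return EMBEDDING_DIMS[model]
    except KeyError:
        raise ValueError(f"Unknown embedding model: {model!r}") from None
```

For example, the result of `default_embedding_dim("text-embedding-3-large")` could be passed as `embedding_dim` alongside `embedding_model="text-embedding-3-large"`.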
LLM Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | From env | OpenAI API key |
| model | str | "gpt-4.1-mini" | Primary LLM model |
| small_model | str | "gpt-4.1-mini" | Model for simpler tasks |
| temperature | float | 0.7 | Sampling temperature (0-2) |
| max_tokens | int | 8192 | Maximum tokens to generate |
| base_url | str | None | Custom API endpoint (for proxies) |
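One way to keep these options out of source code is to collect them from the environment and pass them on as keyword arguments. The sketch below only builds a plain dictionary. The `GRAPHITI_LLM_*` variable names are hypothetical and exist only for this example; the defaults mirror the table above.

```python
import os
from typing import Any, Dict, Mapping, Optional


def llm_config_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments matching the LLM configuration options above.

    OPENAI_API_KEY comes from this page. The GRAPHITI_LLM_* names are
    hypothetical and are not read by Graphiti itself.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {
        "api_key": env.get("OPENAI_API_KEY"),
        "model": env.get("GRAPHITI_LLM_MODEL", "gpt-4.1-mini"),
        "small_model": env.get("GRAPHITI_LLM_SMALL_MODEL", "gpt-4.1-mini"),
        "temperature": float(env.get("GRAPHITI_LLM_TEMPERATURE", "0.7")),
        "max_tokens": int(env.get("GRAPHITI_LLM_MAX_TOKENS", "8192")),
    }
    # Only include base_url when a custom endpoint (e.g. a proxy) is configured.
    base_url = env.get("GRAPHITI_LLM_BASE_URL")
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs
```

With graphiti-core installed, the result could then be used as `LLMConfig(**llm_config_kwargs())`.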
Embedder Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | From env | OpenAI API key |
| embedding_model | str | "text-embedding-3-small" | Embedding model |
| embedding_dim | int | 1536 | Embedding dimensions |
| base_url | str | None | Custom API endpoint |
Structured Output Support
OpenAI models support native structured outputs via the Responses API:
# Graphiti automatically uses structured outputs
# for entity extraction, relationship detection, etc.
# No additional configuration needed!
Benefits:
- Guaranteed valid JSON responses
- Type-safe Pydantic model outputs
- Reduced parsing errors and retries
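To make these benefits concrete, the sketch below shows the kind of manual parsing and validation that free-form JSON output would otherwise require. It uses only the standard library. The `ExtractedEntity` schema is hypothetical and is not one of Graphiti's actual extraction models, which are Pydantic classes.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExtractedEntity:
    """Hypothetical schema for illustration; Graphiti's real models differ."""
    name: str
    entity_type: str


def parse_entities(raw: str) -> List[ExtractedEntity]:
    """Parse a JSON array of entities, raising ValueError on any violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of entities")
    entities: List[ExtractedEntity] = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError(f"expected a JSON object, got {item!r}")
        name, entity_type = item.get("name"), item.get("entity_type")
        if not isinstance(name, str) or not isinstance(entity_type, str):
            raise ValueError(f"missing or non-string field in {item!r}")
        entities.append(ExtractedEntity(name=name, entity_type=entity_type))
    return entities
```

With structured outputs, the model's response is constrained to the schema up front, so checks like these rarely fail and fewer retries are needed.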
Reasoning Models
For GPT-5, o1, and o3 models, Graphiti adapts the request automatically and exposes two reasoning-specific options:
- Temperature: Disabled (not supported by reasoning models)
- Max Tokens: Adjusted to model-specific limits
- Reasoning Effort: Configurable thinking depth
- Verbosity: Control reasoning output detail
from graphiti_core.llm_client.openai_client import OpenAIClient
from graphiti_core.llm_client.config import LLMConfig
llm_client = OpenAIClient(
    config=LLMConfig(model="gpt-5-mini"),
    reasoning="high",  # low, medium, high
    verbosity="low"  # low, medium, high
)
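The parameter handling described in this section can be pictured with the sketch below. It is an illustration under stated assumptions and not Graphiti's implementation: the prefix check is inferred from the model names on this page, the dictionary keys are illustrative rather than OpenAI API field names, and `max_tokens` is passed through unchanged because the model-specific limits are not listed here.

```python
from typing import Any, Dict

# Assumed prefixes, taken from the reasoning model names on this page.
REASONING_PREFIXES = ("gpt-5", "o1", "o3")


def is_reasoning_model(model: str) -> bool:
    """Return True if the model name looks like a reasoning model."""
    return model.startswith(REASONING_PREFIXES)


def build_request_params(
    model: str,
    temperature: float,
    max_tokens: int,
    reasoning: str = "medium",
    verbosity: str = "medium",
) -> Dict[str, Any]:
    """Hypothetical helper: choose which parameters to send for a model."""
    params: Dict[str, Any] = {"model": model, "max_tokens": max_tokens}
    if is_reasoning_model(model):
        # Reasoning models: omit temperature, send reasoning options instead.
        params["reasoning"] = reasoning
        params["verbosity"] = verbosity
    else:
        params["temperature"] = temperature
    return params
```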
Complete Example
import asyncio
import os
from datetime import datetime, timezone
from graphiti_core import Graphiti
from graphiti_core.llm_client.openai_client import OpenAIClient
from graphiti_core.llm_client.config import LLMConfig
from graphiti_core.embedder.openai import OpenAIEmbedder, OpenAIEmbedderConfig
from graphiti_core.nodes import EpisodeType
async def main():
    # Configure OpenAI LLM
    llm_client = OpenAIClient(
        config=LLMConfig(
            api_key=os.environ["OPENAI_API_KEY"],
            model="gpt-4.1-mini",
            temperature=0.7
        )
    )
    # Configure OpenAI embeddings
    embedder = OpenAIEmbedder(
        config=OpenAIEmbedderConfig(
            api_key=os.environ["OPENAI_API_KEY"],
            embedding_model="text-embedding-3-small"
        )
    )
    # Initialize Graphiti
    graphiti = Graphiti(
        "bolt://localhost:7687",
        "neo4j",
        "password",
        llm_client=llm_client,
        embedder=embedder
    )
    try:
        # Add an episode
        await graphiti.add_episode(
            name="Tech News 1",
            episode_body="OpenAI released GPT-5, featuring advanced reasoning capabilities.",
            source=EpisodeType.text,
            reference_time=datetime.now(timezone.utc)
        )
        # Search the graph
        results = await graphiti.search("What are the new OpenAI features?")
        for result in results:
            print(f"Fact: {result.fact}")
    finally:
        # Always close the connection, even if an operation above fails
        await graphiti.close()
if __name__ == "__main__":
    asyncio.run(main())
Rate Limiting
Graphiti caps the number of concurrent operations to avoid provider rate limits. The cap is set with the SEMAPHORE_LIMIT environment variable:
# Increase concurrency if you have higher rate limits
SEMAPHORE_LIMIT=10 # Default: 10 concurrent operations
If you encounter 429 errors, reduce the concurrency limit.
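The general technique behind a concurrency cap like this is a semaphore around each operation. The sketch below shows it with `asyncio.Semaphore`. It illustrates the idea rather than Graphiti's internal code; `bounded_map` and `semaphore_limit_from_env` are hypothetical helpers.

```python
import asyncio
import os
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def semaphore_limit_from_env(default: int = 10) -> int:
    """Read SEMAPHORE_LIMIT from the environment, falling back to a default."""
    return int(os.environ.get("SEMAPHORE_LIMIT", default))


async def bounded_map(
    func: Callable[[T], Awaitable[R]],
    items: Iterable[T],
    limit: int,
) -> List[R]:
    """Run func over items with at most `limit` calls in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(item: T) -> R:
        async with semaphore:  # waits here while `limit` calls are active
            return await func(item)

    # gather preserves input order in its results
    return await asyncio.gather(*(run_one(item) for item in items))
```

Lowering `limit` trades throughput for fewer simultaneous API requests, which is why reducing it helps with 429 responses.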
Error Handling
Graphiti automatically handles:
- Rate Limit Errors: Exponential backoff and retry
- Validation Errors: Automatic retry with error context
- Refusal Errors: Content policy violations (no retry)
- Timeout Errors: Network and API timeouts
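The retry behavior listed above can be sketched as exponential backoff with jitter. This is an illustration of the general pattern under stated assumptions and not Graphiti's implementation: `RateLimitError` and `RefusalError` are hypothetical stand-in classes, and the retry count and delays are arbitrary example values.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


class RefusalError(Exception):
    """Hypothetical stand-in for a content-policy refusal."""


async def call_with_backoff(
    func: Callable[[], Awaitable[T]],
    max_retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Retry func on rate limits, doubling the delay after each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return await func()
        except RefusalError:
            raise  # refusals are not retried
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error
            # Exponential delay plus random jitter to spread out retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            await sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the retry schedule can be tested without real waiting.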
Cost Optimization
- Use Mini Models: gpt-4.1-mini for most tasks
- Batch Operations: Process multiple items together
- Cache Results: Enable caching for repeated queries
- Token Limits: Adjust max_tokens based on needs
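As an illustration of the caching idea, the sketch below memoizes responses in memory, keyed by model and prompt. `PromptCache` is a hypothetical class for this example and does not describe how Graphiti's own caching is enabled or implemented.

```python
import asyncio
import hashlib
from typing import Awaitable, Callable, Dict


class PromptCache:
    """Hypothetical in-memory cache keyed by model name and prompt text."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    async def get_or_call(
        self,
        model: str,
        prompt: str,
        call: Callable[[str], Awaitable[str]],
    ) -> str:
        """Return a cached response, or invoke `call` and store its result."""
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = await call(prompt)
        self._store[key] = result
        return result
```

Repeated identical prompts then cost one API call instead of several, at the price of possibly serving a stale answer.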