Embeddings power semantic search in Graphiti by converting text into vector representations. Configure your preferred embedding provider to enable similarity-based retrieval.
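
To make "similarity-based retrieval" concrete, here is a minimal sketch that ranks stored vectors by cosine similarity to a query vector. The three-dimensional vectors and the `cosine_similarity` and `rank_by_similarity` helpers are illustrative assumptions, not Graphiti APIs; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: list[float], candidates: dict[str, list[float]]) -> list[str]:
    """Return candidate keys sorted from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(query, candidates[name]),
        reverse=True,
    )


# Toy vectors standing in for real embeddings (made-up values).
stored = {
    "software engineer": [0.9, 0.1, 0.0],
    "programmer": [0.8, 0.2, 0.1],
    "banana bread": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]
ranking = rank_by_similarity(query, stored)
print(ranking)  # the unrelated "banana bread" entry ranks last
```

Texts with similar meaning end up with vectors pointing in similar directions, which is why a vector comparison can stand in for a semantic one.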

Supported Providers

OpenAI

text-embedding-3-small, text-embedding-3-large

Azure OpenAI

OpenAI embeddings on Azure infrastructure

Google Gemini

text-embedding-004 and models/embedding-001

Voyage AI

voyage-3, voyage-3-lite for domain-specific embeddings

Default Provider (OpenAI)

By default, Graphiti uses OpenAI’s text-embedding-3-small:
from graphiti_core import Graphiti
import os

# Set your API key
os.environ["OPENAI_API_KEY"] = "sk-..."

# Uses OpenAI embeddings by default
graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password"
)

OpenAI Embeddings

Basic Configuration

from graphiti_core import Graphiti
from graphiti_core.embedder import OpenAIEmbedder, OpenAIEmbedderConfig

# Configure OpenAI embedder
embedder_config = OpenAIEmbedderConfig(
    api_key="sk-...",
    embedding_model="text-embedding-3-small",
    embedding_dim=1536
)

embedder = OpenAIEmbedder(config=embedder_config)

graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
    embedder=embedder
)

Configuration Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | From env | OpenAI API key |
| embedding_model | str | "text-embedding-3-small" | Embedding model name |
| embedding_dim | int | 1024 | Embedding vector dimensions |
| base_url | str | None | Custom API endpoint |

Available Models

text-embedding-3-small
  • Dimensions: 1536 (model default) or lower
  • Best for: General use, cost-effective
  • Price: $0.02 per 1M tokens

Custom Dimensions

Reduce dimensions for lower storage costs:
embedder_config = OpenAIEmbedderConfig(
    embedding_model="text-embedding-3-small",
    embedding_dim=512  # Reduce from the model's native 1536
)
Lower dimensions may reduce search quality. Test before deploying to production.
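
As a rough sketch of the trade-off, the helpers below estimate raw vector storage and show the truncate-then-renormalize approach to shortening a vector. Both helper names are hypothetical, the 4-byte figure assumes float32 storage with no index overhead, and how well a shortened vector preserves quality depends on the model, so treat this as an illustration rather than a description of what Graphiti or the provider does internally.

```python
import math


def storage_bytes(num_vectors: int, dim: int, bytes_per_float: int = 4) -> int:
    """Raw storage for num_vectors embeddings of size dim (assumes float32)."""
    return num_vectors * dim * bytes_per_float


def truncate_and_normalize(vector: list[float], dim: int) -> list[float]:
    """Keep the first dim components and rescale the result to unit length."""
    if dim <= 0 or dim > len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


full = storage_bytes(1_000_000, 1536)
small = storage_bytes(1_000_000, 512)
print(full, small)  # → 6144000000 2048000000

short = truncate_and_normalize([3.0, 4.0, 12.0], 2)
print(short)  # first two components rescaled to unit length
```

For one million vectors, dropping from 1536 to 512 dimensions cuts raw storage to a third, which is the saving to weigh against any measured loss in search quality.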

Azure OpenAI Embeddings

Use OpenAI embeddings deployed on Azure:
from graphiti_core import Graphiti
from graphiti_core.embedder.azure_openai import AzureOpenAIEmbedderClient
from openai import AsyncOpenAI

# Create Azure OpenAI client
azure_client = AsyncOpenAI(
    base_url="https://your-resource.openai.azure.com/openai/v1/",
    api_key="your-azure-api-key"
)

# Configure embedder
embedder = AzureOpenAIEmbedderClient(
    azure_client=azure_client,
    model="text-embedding-3-small"  # Your Azure deployment name
)

graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
    embedder=embedder
)

Environment Variables

AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_API_KEY=your-key
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-3-small
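
One way to connect these variables to the client code above is to read them yourself and derive the values the constructors expect. The sketch below assumes that wiring; `azure_embedder_settings` is a hypothetical helper, and the `/openai/v1/` suffix simply mirrors the `base_url` used in the example above.

```python
import os
from collections.abc import Mapping
from typing import Optional


def azure_embedder_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Collect Azure embedding settings from environment-style variables."""
    env = os.environ if env is None else env
    endpoint = env["AZURE_OPENAI_ENDPOINT"].rstrip("/")
    return {
        # Same URL shape as the base_url in the example above.
        "base_url": f"{endpoint}/openai/v1/",
        "api_key": env["AZURE_OPENAI_API_KEY"],
        # On Azure, the model argument is the deployment name.
        "model": env["AZURE_OPENAI_EMBEDDING_DEPLOYMENT"],
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = azure_embedder_settings(
    {
        "AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com",
        "AZURE_OPENAI_API_KEY": "your-key",
        "AZURE_OPENAI_EMBEDDING_DEPLOYMENT": "text-embedding-3-small",
    }
)
print(settings["base_url"])  # → https://your-resource.openai.azure.com/openai/v1/
```

The resulting `base_url` and `api_key` would go to `AsyncOpenAI(...)` and `model` to `AzureOpenAIEmbedderClient(...)`, as in the example above. A missing variable raises `KeyError`, so misconfiguration fails early.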

Google Gemini Embeddings

Use Google’s embedding models:
from graphiti_core import Graphiti
from graphiti_core.embedder.gemini import GeminiEmbedder, GeminiEmbedderConfig

# Configure Gemini embedder
embedder_config = GeminiEmbedderConfig(
    api_key="your-google-api-key",
    embedding_model="text-embedding-004",
    embedding_dim=768
)

embedder = GeminiEmbedder(config=embedder_config)

graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
    embedder=embedder
)

Environment Variables

GOOGLE_API_KEY=your-key

Available Models

  • text-embedding-004 - Latest model, 768 dimensions
  • models/embedding-001 - Earlier model, 768 dimensions

Voyage AI Embeddings

Use Voyage AI for specialized domain embeddings:

Installation

pip install "graphiti-core[voyageai]"

Configuration

from graphiti_core import Graphiti
from graphiti_core.embedder.voyage import VoyageAIEmbedder, VoyageAIEmbedderConfig

# Configure Voyage AI embedder
embedder_config = VoyageAIEmbedderConfig(
    api_key="pa-...",
    embedding_model="voyage-3",
    embedding_dim=1024
)

embedder = VoyageAIEmbedder(config=embedder_config)

graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
    embedder=embedder
)

Environment Variables

VOYAGE_API_KEY=pa-...

Available Models

voyage-3
  • Dimensions: 1024
  • Best for: General-purpose, state-of-the-art quality
  • Context: 32K tokens

Custom Embedding Clients

Implement a custom embedder by extending EmbedderClient:
from collections.abc import Iterable

from graphiti_core import Graphiti
from graphiti_core.embedder.client import EmbedderClient, EmbedderConfig

class CustomEmbedderConfig(EmbedderConfig):
    embedding_model: str = "custom-model"
    api_endpoint: str = "https://api.example.com"

class CustomEmbedder(EmbedderClient):
    def __init__(self, config: CustomEmbedderConfig):
        self.config = config
        # Initialize your client here
    
    async def create(
        self, 
        input_data: str | list[str] | Iterable[int] | Iterable[Iterable[int]]
    ) -> list[float]:
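        # Generate an embedding for a single input and return it as a list of floats
        raise NotImplementedError("call your embedding service here")
    
    async def create_batch(self, input_data_list: list[str]) -> list[list[float]]:
        # Generate embeddings for a batch of inputs and return one vector per input
        raise NotImplementedError("call your embedding service here")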

# Use custom embedder
embedder = CustomEmbedder(config=CustomEmbedderConfig())
graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
    embedder=embedder
)
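
To see the two methods do something end to end, here is a toy, dependency-free stand-in with the same `create` / `create_batch` shape. `HashingEmbedder` is a hypothetical class that hashes tokens into buckets; it does not subclass `EmbedderClient` and produces no semantic signal, so it is only useful for checking wiring and vector shapes without calling an API.

```python
import asyncio
import hashlib
import math


class HashingEmbedder:
    """Toy embedder: counts tokens into hash buckets, then unit-normalizes."""

    def __init__(self, embedding_dim: int = 8):
        self.embedding_dim = embedding_dim

    def _embed(self, text: str) -> list[float]:
        vector = [0.0] * self.embedding_dim
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:4], "big") % self.embedding_dim
            vector[bucket] += 1.0
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector] if norm else vector

    async def create(self, input_data: str) -> list[float]:
        # Single input: one vector.
        return self._embed(input_data)

    async def create_batch(self, input_data_list: list[str]) -> list[list[float]]:
        # Batch input: one vector per text, in the same order.
        return [self._embed(text) for text in input_data_list]


async def main() -> tuple[list[float], list[list[float]]]:
    embedder = HashingEmbedder(embedding_dim=8)
    single = await embedder.create("Alice is a software engineer")
    batch = await embedder.create_batch(["Alice", "Bob"])
    return single, batch


single, batch = asyncio.run(main())
print(len(single), len(batch))  # → 8 2
```

A real implementation would keep the same contract: `create` returns one vector of `embedding_dim` floats, and `create_batch` returns one such vector per input, in input order.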

Embedding Usage

Embeddings are generated automatically for:
  • Entity Names - For entity similarity search
  • Relationship Facts - For semantic fact retrieval
  • Community Summaries - For cluster-based search
  • Episode Content - For source document search
from datetime import datetime, timezone

from graphiti_core.nodes import EpisodeType

# Add episode - embeddings generated automatically
result = await graphiti.add_episode(
    name="Example",
    episode_body="Alice is a software engineer at Google.",
    source=EpisodeType.text,
    source_description="Bio",
    reference_time=datetime.now(timezone.utc)
)

# Nodes have name embeddings
for node in result.nodes:
    if node.name_embedding:
        print(f"Embedding dims: {len(node.name_embedding)}")

# Edges have fact embeddings
for edge in result.edges:
    if edge.fact_embedding:
        print(f"Embedding dims: {len(edge.fact_embedding)}")

Batch Embedding Generation

Graphiti batches embedding requests for efficiency:
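from datetime import datetime, timezone

from graphiti_core.nodes import EpisodeType
from graphiti_core.utils.bulk_utils import RawEpisode

# Multiple embeddings generated in batches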
await graphiti.add_episode_bulk(
    bulk_episodes=[
        RawEpisode(
            name="Episode 1",
            content="Content 1",
            source=EpisodeType.text,
            source_description="Source",
            reference_time=datetime.now(timezone.utc)
        ),
        RawEpisode(
            name="Episode 2",
            content="Content 2",
            source=EpisodeType.text,
            source_description="Source",
            reference_time=datetime.now(timezone.utc)
        )
    ]
)
# Embeddings generated in efficient batches
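
The general idea behind batching can be sketched as splitting the texts into fixed-size groups and sending one request per group instead of one per text. The `chunked` helper and the batch size of 3 are assumptions for illustration; this is not Graphiti's internal batching logic.

```python
from collections.abc import Iterator


def chunked(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"Content {i}" for i in range(1, 8)]  # seven texts to embed
batches = list(chunked(texts, 3))
print([len(batch) for batch in batches])  # → [3, 3, 1]
```

Seven texts become three requests rather than seven, which reduces per-request overhead.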

Choosing an Embedding Model

Quality vs Cost

Balance search quality with API costs. Start with text-embedding-3-small.

Dimension Size

Higher dimensions generally improve quality but need more storage. 512-1024 dimensions work for most cases.

Domain Specificity

Use domain-specific models (Voyage) for specialized content.

Consistency

Avoid changing embedding models after deployment; doing so requires re-embedding all existing content.

Cost Optimization

  1. Choose efficient models: use text-embedding-3-small or voyage-3-lite for cost savings.
  2. Reduce dimensions: lower embedding_dim to reduce storage costs.
  3. Batch operations: use add_episode_bulk() to generate embeddings in larger batches.
  4. Cache strategy: consider caching embeddings for frequently used content.
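
A caching strategy could look like the following minimal sketch: an in-memory cache keyed by a hash of the model name and the text, so repeated content is embedded only once. `EmbeddingCache` and `fake_embed` are hypothetical names; `fake_embed` stands in for a real, billable embedding call, and a production cache would likely need persistence and eviction.

```python
import hashlib
from collections.abc import Callable


class EmbeddingCache:
    """In-memory cache keyed by (model, text) so each text is embedded once."""

    def __init__(self, embed_fn: Callable[[str], list[float]], model: str):
        self._embed_fn = embed_fn
        self._model = model
        self._store: dict[str, list[float]] = {}
        self.misses = 0  # number of times the underlying embedder was called

    def _key(self, text: str) -> str:
        # Include the model name so vectors from different models never mix.
        return hashlib.sha256(f"{self._model}\n{text}".encode("utf-8")).hexdigest()

    def get(self, text: str) -> list[float]:
        key = self._key(text)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def fake_embed(text: str) -> list[float]:
    """Placeholder for a real embedding call."""
    return [float(len(text)), 1.0]


cache = EmbeddingCache(fake_embed, model="text-embedding-3-small")
first = cache.get("Alice is a software engineer at Google.")
second = cache.get("Alice is a software engineer at Google.")
print(cache.misses)  # → 1
```

Keying on the model name as well as the text means a model change invalidates old entries automatically.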

Performance Tips

# Use smaller dimensions for faster search
embedder_config = OpenAIEmbedderConfig(
    embedding_model="text-embedding-3-small",
    embedding_dim=512  # Faster than 1536
)

# Batch embeddings for efficiency
await graphiti.add_episode_bulk(episodes)  # Better than individual adds

# Monitor embedding generation
result = await graphiti.add_episode(...)
print(f"Nodes embedded: {len([n for n in result.nodes if n.name_embedding])}")
print(f"Edges embedded: {len([e for e in result.edges if e.fact_embedding])}")

Changing Embedding Models

Changing embedding models requires re-embedding all existing content. Embeddings from different models are not comparable.
If you need to switch models:
  1. Export your graph structure and content
  2. Create a new database
  3. Configure the new embedding model
  4. Re-ingest all episodes with the new embedder
# Export current data
episodes = await graphiti.retrieve_episodes(...)

# Create new database with new embedder
new_embedder = VoyageAIEmbedder(config=VoyageAIEmbedderConfig(
    embedding_model="voyage-3"
))

new_graphiti = Graphiti(
    uri="bolt://localhost:7687",
    user="neo4j",
    password="password",
    embedder=new_embedder
)

# Re-ingest with new embeddings
for episode in episodes:
    await new_graphiti.add_episode(...)
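
During a migration like this, it can help to verify that no vectors from the old model remain. The sketch below assumes you record the model name alongside each stored vector; `TaggedEmbedding` and `find_stale` are hypothetical and not part of Graphiti's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedEmbedding:
    """A vector together with the name of the model that produced it."""

    model: str
    vector: tuple[float, ...]


def find_stale(stored: list[TaggedEmbedding], target_model: str, target_dim: int) -> list[int]:
    """Return indexes of embeddings that do not match the target model or dimension."""
    return [
        index
        for index, item in enumerate(stored)
        if item.model != target_model or len(item.vector) != target_dim
    ]


stored = [
    TaggedEmbedding("text-embedding-3-small", (0.1, 0.2, 0.3, 0.4)),
    TaggedEmbedding("voyage-3", (0.2, 0.2, 0.2, 0.2)),
    TaggedEmbedding("voyage-3", (0.2, 0.2, 0.2)),  # wrong dimension
]
stale = find_stale(stored, target_model="voyage-3", target_dim=4)
print(stale)  # → [0, 2]
```

An empty result means every stored vector comes from the target model at the expected size, so similarity scores are comparable across the whole graph.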

Next Steps

Searching

Use embeddings for semantic search

LLM Providers

Configure your LLM provider

Graph Drivers

Choose your graph database
