OpenAI Clients

OpenAIClient

The primary client for OpenAI’s GPT models with support for structured outputs using the responses.parse API.

Installation

pip install graphiti-core

The OpenAI SDK is included by default.

Basic Usage

from graphiti_core.llm_client import OpenAIClient
from graphiti_core.llm_client.config import LLMConfig
from graphiti_core.prompts.models import Message
from pydantic import BaseModel

# Initialize client
client = OpenAIClient(
    config=LLMConfig(
        api_key="sk-...",
        model="gpt-4.1-mini",
        temperature=1.0,
        max_tokens=16384
    )
)

# Define response structure
class ExtractedInfo(BaseModel):
    name: str
    age: int
    occupation: str

# Generate structured response
messages = [
    Message(role="system", content="Extract person information from text."),
    Message(role="user", content="John is a 30 year old software engineer.")
]

response = await client.generate_response(
    messages=messages,
    response_model=ExtractedInfo
)

print(response)  # {'name': 'John', 'age': 30, 'occupation': 'software engineer'}

Constructor

config

LLMConfig | None

default:"None"

Configuration object. If None, creates default config.

cache

bool

default:"False"

Enable response caching (not currently implemented, raises NotImplementedError if True)

client

Any | None

default:"None"

Optional pre-configured AsyncOpenAI client instance. If not provided, creates one from config.

max_tokens

int

default:"16384"

Maximum output tokens. Defaults to 16384 for compatibility.

reasoning

str

default:"'minimal'"

Reasoning effort level for reasoning models (GPT-5, o1, o3). Options: 'minimal', 'low', 'medium', 'high'

verbosity

str

default:"'low'"

Verbosity level for reasoning models. Options: 'low', 'medium', 'high'

Supported Models

Reasoning Models (via responses.parse API):

gpt-5-* series
o1-* series
o3-* series

Standard Models (via chat.completions.create):

gpt-4.1-mini (recommended)
gpt-4.1-nano
gpt-4o
gpt-4-turbo
All other GPT models

Reasoning models (GPT-5, o1, o3) do not support temperature settings. The client automatically omits temperature for these models.

Reasoning Model Configuration

For GPT-5 and o-series models, configure reasoning depth:

client = OpenAIClient(
    config=LLMConfig(
        api_key="sk-...",
        model="gpt-5-preview"
    ),
    reasoning="high",      # More thorough reasoning
    verbosity="medium"     # Detailed output
)

Custom Base URL

Use OpenAI-compatible endpoints:

client = OpenAIClient(
    config=LLMConfig(
        api_key="your-key",
        base_url="https://api.your-provider.com/v1"
    )
)

Response Format

The client uses different APIs based on model capabilities: Reasoning Models (responses.parse):

response = await client.responses.parse(
    model="gpt-5-preview",
    input=messages,
    max_output_tokens=max_tokens,
    text_format=response_model,
    reasoning={'effort': 'minimal'},
    text={'verbosity': 'low'}
)

Standard Models (chat.completions.create):

response = await client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages,
    temperature=1.0,
    max_tokens=max_tokens,
    response_format={'type': 'json_object'}
)

OpenAIGenericClient

A simplified OpenAI client designed for local and third-party OpenAI-compatible models. Does not support caching or the responses.parse API.

When to Use

Local models (e.g., Ollama, LM Studio)
Third-party OpenAI-compatible APIs
Models with higher token limits
Simpler integration requirements

Basic Usage

from graphiti_core.llm_client import OpenAIGenericClient
from graphiti_core.llm_client.config import LLMConfig

# For local Ollama instance
client = OpenAIGenericClient(
    config=LLMConfig(
        base_url="http://localhost:11434/v1",
        model="llama3",
        api_key="not-needed"  # Ollama doesn't require key
    ),
    max_tokens=32000  # Higher limit for local models
)

Constructor

config

LLMConfig | None

default:"None"

Configuration object. If None, creates default config.

cache

bool

default:"False"

Caching is not supported. Raises NotImplementedError if True.

client

Any | None

default:"None"

Optional pre-configured AsyncOpenAI client instance.

max_tokens

int

default:"16384"

Maximum output tokens. Default increased to 16384 for better local model compatibility.

Key Differences from OpenAIClient

Feature	OpenAIClient	OpenAIGenericClient
Caching	Supported (not implemented)	Not supported
responses.parse API	Yes (reasoning models)	No
Structured outputs	Via responses.parse	Via json_schema
Max retries	2 (configurable)	2 (fixed)
Default max_tokens	16384	16384
Reasoning/verbosity	Yes	No

Structured Output Handling

Uses json_schema in response format:

response_format = {
    'type': 'json_schema',
    'json_schema': {
        'name': 'structured_response',
        'schema': response_model.model_json_schema()
    }
}

Error Handling

Implements custom retry logic:

Max 2 retries on validation/parsing errors
No retry for rate limits or refusals
Automatic retry for OpenAI client errors (timeout, connection, server errors)
Appends error context to messages for model self-correction

Example: Local Model

from graphiti_core.llm_client import OpenAIGenericClient
from graphiti_core.llm_client.config import LLMConfig
from pydantic import BaseModel

class Summary(BaseModel):
    title: str
    key_points: list[str]

client = OpenAIGenericClient(
    config=LLMConfig(
        base_url="http://localhost:11434/v1",
        model="llama3:70b"
    ),
    max_tokens=8192
)

messages = [
    Message(role="system", content="Summarize the following text."),
    Message(role="user", content="Long article text...")
]

summary = await client.generate_response(
    messages=messages,
    response_model=Summary
)

Compatibility Notes

Works with any OpenAI-compatible API
Does not use provider-specific features
JSON schema support required for structured outputs
Temperature and max_tokens always included in requests

Core

Data Models

Drivers

LLM Clients

Embedders

OpenAIClient

Installation

Basic Usage

Constructor

Supported Models

Reasoning Model Configuration

Custom Base URL

Response Format

OpenAIGenericClient

When to Use

Basic Usage

Constructor

Key Differences from OpenAIClient

Structured Output Handling

Error Handling

Example: Local Model

Compatibility Notes

Build docs developers (and LLMs) love

Core

Data Models

Drivers

LLM Clients

Embedders

Documentation Index

​OpenAIClient

​Installation

​Basic Usage

​Constructor

​Supported Models

​Reasoning Model Configuration

​Custom Base URL

​Response Format

​OpenAIGenericClient

​When to Use

​Basic Usage

​Constructor

​Key Differences from OpenAIClient

​Structured Output Handling

​Error Handling

​Example: Local Model

​Compatibility Notes

Build docs developers (and LLMs) love

OpenAIClient

Installation

Basic Usage

Constructor

Supported Models

Reasoning Model Configuration

Custom Base URL

Response Format

OpenAIGenericClient

When to Use

Basic Usage

Constructor

Key Differences from OpenAIClient

Structured Output Handling

Error Handling

Example: Local Model

Compatibility Notes