Azure OpenAI Integration

Azure OpenAI Service provides enterprise-ready deployment of OpenAI models with Microsoft Azure’s security, compliance, and regional availability.

Installation

Azure OpenAI support is included in the base installation:

pip install graphiti-core

Prerequisites

Azure OpenAI Resource: Deployed in Azure Portal
Model Deployments: Deploy your chosen models (e.g., gpt-4.1, text-embedding-3-small)
API Keys: Retrieve from Azure Portal

Configuration

Environment Variables

.env

AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com
AZURE_OPENAI_API_KEY=your-api-key
AZURE_OPENAI_DEPLOYMENT=gpt-4.1
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-3-small
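
A missing variable otherwise only shows up as a KeyError when the client is constructed, so it can help to check the environment up front. The following is a minimal sketch using only the standard library; the helper name missing_azure_vars is hypothetical and not part of Graphiti.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
)


def missing_azure_vars(env=None):
    """Return the required names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
print(missing_azure_vars({"AZURE_OPENAI_API_KEY": "your-api-key"}))
```

Calling missing_azure_vars() with no argument inspects os.environ, so it could run right after load_dotenv() and before any client is created.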

Basic Setup

Initialize Graphiti with Azure OpenAI:

import os
from openai import AsyncOpenAI
from graphiti_core import Graphiti
from graphiti_core.llm_client.azure_openai_client import AzureOpenAILLMClient
from graphiti_core.llm_client.config import LLMConfig
from graphiti_core.embedder.azure_openai import AzureOpenAIEmbedderClient

# Initialize Azure OpenAI client using v1 API endpoint
azure_client = AsyncOpenAI(
    base_url=f"{os.environ['AZURE_OPENAI_ENDPOINT'].rstrip('/')}/openai/v1/",
    api_key=os.environ['AZURE_OPENAI_API_KEY']
)

# Create LLM client
llm_client = AzureOpenAILLMClient(
    azure_client=azure_client,
    config=LLMConfig(
        model=os.environ['AZURE_OPENAI_DEPLOYMENT'],
        small_model=os.environ['AZURE_OPENAI_DEPLOYMENT']
    )
)

# Create embedder client
embedder_client = AzureOpenAIEmbedderClient(
    azure_client=azure_client,
    model=os.environ['AZURE_OPENAI_EMBEDDING_DEPLOYMENT']
)

# Initialize Graphiti
graphiti = Graphiti(
    "bolt://localhost:7687",
    "neo4j",
    "password",
    llm_client=llm_client,
    embedder=embedder_client
)

API Endpoint Format

Azure OpenAI uses a different endpoint structure than OpenAI.

Correct format:

base_url="https://your-resource-name.openai.azure.com/openai/v1/"

Important notes:

Use the v1 API endpoint (/openai/v1/)
The standard AsyncOpenAI client works with Azure’s v1 API
Deployment names (not model names) are used in requests
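
An endpoint copied with a trailing slash would produce a double slash when joined with the v1 path. The sketch below normalizes the endpoint before building the base URL. The function name build_azure_base_url and the https:// check are assumptions made for illustration, not part of Graphiti or the OpenAI SDK.

```python
def build_azure_base_url(endpoint: str) -> str:
    """Return the v1 base URL for an Azure OpenAI resource endpoint (hypothetical helper)."""
    # Strip whitespace and trailing slashes so we never emit "//openai/v1/".
    root = endpoint.strip().rstrip("/")
    if not root.startswith("https://"):
        # Assumption: resource endpoints are always served over HTTPS.
        raise ValueError("expected an https:// Azure OpenAI endpoint")
    # An endpoint that already carries the v1 path only needs its final slash.
    if root.endswith("/openai/v1"):
        return root + "/"
    return root + "/openai/v1/"


print(build_azure_base_url("https://your-resource-name.openai.azure.com/"))
```

The result can be passed as base_url to AsyncOpenAI in place of the f-string shown in Basic Setup.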

Supported Models

Language Models

Deploy any of these models in Azure:

gpt-4.1: Latest GPT-4.1 model
gpt-4.1-mini: Cost-effective mini model
gpt-5-mini: Reasoning model with extended thinking
gpt-5: Advanced reasoning model
gpt-4o: Optimized GPT-4
o1, o3: Specialized reasoning models

Embedding Models

text-embedding-3-small: 1536 dimensions, cost-effective
text-embedding-3-large: 3072 dimensions, highest quality
text-embedding-ada-002: Legacy model
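
The dimension counts above translate directly into storage size. As a rough worked example, the sketch below estimates the raw size of stored vectors. It assumes 32-bit floats and ignores index and database overhead, so treat the numbers as a lower bound rather than a measurement.

```python
# Dimensions as listed above for the two text-embedding-3 models.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def raw_vector_bytes(model: str, num_vectors: int) -> int:
    """Raw size of num_vectors embeddings, ignoring index and database overhead."""
    return EMBEDDING_DIMS[model] * BYTES_PER_FLOAT32 * num_vectors


for name in EMBEDDING_DIMS:
    mib = raw_vector_bytes(name, 100_000) / (1024 ** 2)
    print(f"{name}: {mib:.1f} MiB per 100,000 vectors")
```

Under these assumptions, text-embedding-3-large needs twice the raw storage of text-embedding-3-small for the same number of vectors.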

Model Deployment Names

In Azure, you create deployments with custom names for each model:

# Your deployment names in Azure Portal
config=LLMConfig(
    model="my-gpt4-deployment",      # Your deployment name
    small_model="my-gpt4-deployment" # Can be the same
)
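
If a cheaper model is deployed for the small_model slot, its name can be resolved separately and fall back to the main deployment when absent. This is a minimal sketch: the helper resolve_deployments and the variable AZURE_OPENAI_SMALL_DEPLOYMENT are hypothetical names introduced here, not settings that Graphiti reads.

```python
import os


def resolve_deployments(env=None):
    """Pick deployment names for the main and small model slots (hypothetical helper).

    AZURE_OPENAI_SMALL_DEPLOYMENT is an assumed, optional variable name;
    when it is unset or empty, the main deployment is reused for both slots.
    """
    env = os.environ if env is None else env
    main = env["AZURE_OPENAI_DEPLOYMENT"]
    small = env.get("AZURE_OPENAI_SMALL_DEPLOYMENT") or main
    return {"model": main, "small_model": small}


print(resolve_deployments({"AZURE_OPENAI_DEPLOYMENT": "my-gpt4-deployment"}))
```

The returned keys match the LLMConfig arguments shown above, so the dict could be unpacked as LLMConfig(**resolve_deployments()).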

Configuration Options

LLM Client

Parameter	Type	Default	Description
`azure_client`	AsyncOpenAI	Required	Azure OpenAI client instance
`config`	LLMConfig	None	LLM configuration
`max_tokens`	int	`8192`	Maximum tokens to generate
`reasoning`	str	None	Reasoning effort for reasoning models
`verbosity`	str	None	Verbosity for reasoning models

Embedder Client

Parameter	Type	Default	Description
`azure_client`	AsyncOpenAI	Required	Azure OpenAI client instance
`model`	str	`"text-embedding-3-small"`	Embedding deployment name

Structured Output Support

Azure OpenAI supports structured outputs.

For regular models (GPT-4o, etc.):

Uses beta.chat.completions.parse API
Native Pydantic model validation

For reasoning models (GPT-5, o1, o3):

Uses responses.parse API
Supports reasoning and verbosity parameters

# Graphiti automatically selects the right API
# based on the model type
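
To make the split concrete, the sketch below classifies a model name by prefix and returns the API name this section associates with it. It is illustrative only: the prefix list and both function names are assumptions, Graphiti's actual detection logic may differ, and a custom deployment name such as my-gpt4-deployment would not match any prefix.

```python
# Prefixes for the reasoning families named in this section (GPT-5, o1, o3).
REASONING_PREFIXES = ("gpt-5", "o1", "o3")


def is_reasoning_model(model_name: str) -> bool:
    """Guess whether a model name belongs to a reasoning family (illustrative only)."""
    return model_name.lower().startswith(REASONING_PREFIXES)


def pick_structured_output_api(model_name: str) -> str:
    """Return the API name this section associates with the model type."""
    if is_reasoning_model(model_name):
        return "responses.parse"
    return "beta.chat.completions.parse"


for name in ("gpt-4o", "gpt-4.1", "gpt-5-mini", "o3"):
    print(name, "->", pick_structured_output_api(name))
```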

Reasoning Models Configuration

For GPT-5, o1, and o3 models:

llm_client = AzureOpenAILLMClient(
    azure_client=azure_client,
    config=LLMConfig(model="gpt-5-mini-deployment"),
    reasoning="high",  # low, medium, high
    verbosity="low"    # low, medium, high
)

Note: Reasoning models don’t support the temperature parameter.

Complete Example

import asyncio
import os
from datetime import datetime, timezone
from dotenv import load_dotenv
from openai import AsyncOpenAI
from graphiti_core import Graphiti
from graphiti_core.llm_client.azure_openai_client import AzureOpenAILLMClient
from graphiti_core.llm_client.config import LLMConfig
from graphiti_core.embedder.azure_openai import AzureOpenAIEmbedderClient
from graphiti_core.nodes import EpisodeType

load_dotenv()

async def main():
    # Azure OpenAI configuration
    azure_endpoint = os.environ['AZURE_OPENAI_ENDPOINT'].rstrip('/')  # avoid "//" in base_url
    azure_api_key = os.environ['AZURE_OPENAI_API_KEY']
    azure_deployment = os.environ.get('AZURE_OPENAI_DEPLOYMENT', 'gpt-4.1')
    azure_embedding_deployment = os.environ.get(
        'AZURE_OPENAI_EMBEDDING_DEPLOYMENT',
        'text-embedding-3-small'
    )
    
    # Initialize Azure OpenAI client
    azure_client = AsyncOpenAI(
        base_url=f"{azure_endpoint}/openai/v1/",
        api_key=azure_api_key
    )
    
    # Create LLM and Embedder clients
    llm_client = AzureOpenAILLMClient(
        azure_client=azure_client,
        config=LLMConfig(
            model=azure_deployment,
            small_model=azure_deployment
        )
    )
    
    embedder_client = AzureOpenAIEmbedderClient(
        azure_client=azure_client,
        model=azure_embedding_deployment
    )
    
    # Initialize Graphiti
    graphiti = Graphiti(
        "bolt://localhost:7687",
        "neo4j",
        "password",
        llm_client=llm_client,
        embedder=embedder_client
    )
    
    try:
        # Add an episode
        await graphiti.add_episode(
            name="California Politics 1",
            episode_body="Kamala Harris is the Attorney General of California.",
            source=EpisodeType.text,
            reference_time=datetime.now(timezone.utc)
        )
        print("Added episode using Azure OpenAI")
        
        # Search the graph
        results = await graphiti.search("Who was the California Attorney General?")
        for result in results:
            print(f"Fact: {result.fact}")
    
    finally:
        await graphiti.close()
        print("Connection closed")

if __name__ == "__main__":
    asyncio.run(main())

Error Handling

Graphiti automatically handles:

Rate Limit Errors: Exponential backoff and retry
Validation Errors: Automatic retry with error context
Refusal Errors: Content policy violations (no retry)
API Errors: Network and service errors
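
For readers unfamiliar with the first strategy, the sketch below shows the general shape of exponential backoff around an async call. It is not Graphiti's implementation: RateLimited is a stand-in exception, and the attempt count, base delay, and jitter are assumed values chosen for illustration.

```python
import asyncio
import random


class RateLimited(Exception):
    """Stand-in for a provider rate-limit error (hypothetical)."""


async def retry_with_backoff(call, max_attempts=4, base_delay=0.5, sleep=asyncio.sleep):
    """Await call(), retrying on RateLimited with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles on each attempt (0.5s, 1s, 2s, ...) plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            await sleep(delay + random.uniform(0, delay * 0.1))
```

The sleep parameter exists only so the delay can be replaced in tests. Errors that should not be retried, such as refusals, would simply not be caught by the except clause.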

When to Use Azure OpenAI

Choose Azure OpenAI if you:

Need enterprise compliance (SOC 2, HIPAA, etc.)
Want regional data residency
Require private network access (VNet integration)
Need Microsoft Entra ID (formerly Azure Active Directory) authentication
Want cost management through Azure subscriptions
Require Service Level Agreements (SLAs)

Choose Standard OpenAI if you:

Want access to the latest models immediately
Don’t need enterprise compliance features
Prefer simpler setup and pricing

Regional Availability

Azure OpenAI is available in multiple regions:

East US, East US 2
West US, West US 2, West US 3
North Europe, West Europe
UK South
And more…

Check Azure OpenAI regions for current availability.

Rate Limits and Quotas

Azure OpenAI uses Tokens Per Minute (TPM) quotas:

.env

# Adjust based on your quota; lower this if you hit rate limits
SEMAPHORE_LIMIT=5

Monitor usage in Azure Portal to adjust concurrency.
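
The effect of a concurrency limit can be sketched with asyncio.Semaphore. This is not Graphiti's internal code. The helper run_bounded is hypothetical, and reading SEMAPHORE_LIMIT with a default of 5 simply mirrors the .env example above.

```python
import asyncio
import os


async def run_bounded(jobs, limit=None):
    """Run zero-argument coroutine functions with at most `limit` in flight at once."""
    if limit is None:
        # Same variable name as the .env example; the default of 5 mirrors it.
        limit = int(os.environ.get("SEMAPHORE_LIMIT", "5"))
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:  # waits here while `limit` jobs are already running
            return await job()

    # gather preserves the order of `jobs` in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A lower limit means fewer simultaneous requests and therefore fewer tokens consumed per minute, at the cost of longer total ingestion time.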

Security Best Practices

Use Managed Identity: Avoid API keys in production
Enable Private Endpoints: Restrict network access
Configure Microsoft Entra ID (formerly Azure AD): Use role-based access control
Enable Audit Logging: Track all API usage
Rotate Keys: Regularly rotate API keys

Cost Management

Use Provisioned Throughput: For predictable costs
Monitor Usage: Set up Azure cost alerts
Use Mini Models: Lower costs for simpler tasks
Batch Operations: Reduce API calls

Monitoring and Logging

Enable Azure Monitor for:

Request/response logging
Performance metrics
Cost tracking
Error analysis

On the client side, Python's standard logging module can surface Graphiti's log output:

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("graphiti_core")
