ODAI follows a clean, four-layer architecture that separates concerns between request handling, business logic, third-party integration, and data persistence.
┌─────────────────────────────────────────┐
│      Client Applications                 │
│  (Web, Mobile, Voice Calls)             │
└─────────────────┬───────────────────────┘
                  │ WebSocket/HTTP
┌─────────────────▼───────────────────────┐
│         API Layer                        │
│  (FastAPI, Routers, WebSocket)          │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────▼───────────────────────┐
│       Service Layer                      │
│  (AuthService, ChatService,             │
│   LocationService)                       │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────▼───────────────────────┐
│     Integration Layer                    │
│  (Orchestrator + 30+ Connectors)        │
└─────────────────┬───────────────────────┘
                  │
┌─────────────────▼───────────────────────┐
│        Data Layer                        │
│   (Firebase/Firestore Models)            │
└─────────────────────────────────────────┘

The four layers

API layer

api.py and routers/ handle all inbound HTTP and WebSocket connections. The ODAPIApplication class wires together middleware, routers, and the WebSocket endpoint. CORS policy, authentication checks, and route registration all live here.

Service layer

services/ contains the business logic that the API layer delegates to. AuthService validates Firebase tokens, ChatService manages Firestore chat documents and tracks analytics, and LocationService resolves IP addresses to geographic context.

Integration layer

connectors/ houses the OpenAI agent definitions for each third-party service. The Orchestrator assembles a GPT-4o agent with 30+ specialist agents as handoff targets. When a request arrives, the orchestrator decides which agents to invoke, runs them in parallel where possible, and merges results.
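The parallel fan-out can be sketched with asyncio. This is an illustrative stand-in, not the orchestrator's real code: the agent names, the simulated round trip, and the merge step are all assumptions.

```python
import asyncio

# Illustrative fan-out: await several specialist agents concurrently and
# merge their results. Real agents stream responses via the Agents SDK.
async def run_agent(name: str, prompt: str) -> str:
    await asyncio.sleep(0)  # stand-in for an API round trip
    return f"{name} handled: {prompt}"

async def fan_out(prompt: str) -> str:
    results = await asyncio.gather(
        run_agent("yelp_agent", prompt),
        run_agent("maps_agent", prompt),
    )
    return " | ".join(results)  # merge step is illustrative

merged = asyncio.run(fan_out("dinner near me"))
```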

Data layer

firebase/models/ provides Firestore-backed model classes for users, chats, OAuth tokens, token usage, and error tracking. Each model inherits from FireStoreObject and abstracts all database operations behind a clean API.

Core components

ODAPIApplication (api.py)

The application entry point. The constructor initializes all services and wires them together through dependency injection:
class ODAPIApplication:
    def __init__(self):
        # get_settings() selects the Settings class for the current
        # environment (local .env or Secret Manager; see config.py)
        Settings = get_settings()
        self.settings = Settings()

        self.openai_client = OpenAI(api_key=self.settings.openai_api_key)

        self.chat_service = ChatService()
        self.api_service = APIService()
        self.connection_manager = ConnectionManager()
        self.websocket_handler = WebSocketHandler(
            self.settings, self.openai_client, self.connection_manager
        )

        self.app = self._create_app()
        self._setup_routes()
The WebSocket endpoint is mounted at /chats/{chat_id} and delegates the full connection lifecycle to WebSocketHandler:
@self.app.websocket("/chats/{chat_id}")
async def websocket_endpoint(websocket, chat_id, token, ...):
    await self.websocket_handler.handle_websocket_connection(
        websocket=websocket,
        chat_id=chat_id,
        token=token,
        ...
    )

ChatService (services/chat_service.py)

Manages the lifecycle of a chat session in Firestore. Responsibilities include:
  • Creating or retrieving a Chat document by ID and user
  • Persisting message history and per-turn responses
  • Recording OpenAI token usage for both the chat and the user
  • Logging unhandled requests for future capability planning
  • Emitting analytics events via Segment for every significant action
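The create-or-retrieve flow can be sketched as follows. This is a hypothetical reduction of ChatService: an in-memory dict stands in for the Firestore chats collection, and update_chat_messages mirrors the method named later in this page.

```python
# Hypothetical sketch of ChatService's chat lifecycle; a dict stands in
# for the Firestore `chats` collection.
class ChatService:
    def __init__(self):
        self._chats = {}  # chat_id -> chat document (stand-in for Firestore)

    def get_or_create_chat(self, chat_id: str, user_id: str) -> dict:
        chat = self._chats.get(chat_id)
        if chat is None:
            # New session: create the document with an empty history
            chat = {"id": chat_id, "user_id": user_id, "messages": []}
            self._chats[chat_id] = chat
        return chat

    def update_chat_messages(self, chat_id: str, messages: list) -> None:
        self._chats[chat_id]["messages"] = messages

svc = ChatService()
chat = svc.get_or_create_chat("c1", "u1")
svc.update_chat_messages("c1", [{"role": "user", "content": "hi"}])
```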

AuthService (services/auth_service.py)

Validates Firebase ID tokens on every connection. In production mode it additionally enforces that:
  • The user is not anonymous
  • The user has accepted the terms of service
On WebSocket authentication failure the service closes the connection with code 1008 before the chat loop begins.
async def authenticate_websocket(self, websocket, token):
    valid, user, user_anonymous = self.validate_user_token(token)
    if not valid:
        # Policy violation: reject before the chat loop begins
        await websocket.close(code=1008)
        return None, None
    return user, user_anonymous

ConnectionManager (websocket/connection_manager.py)

Tracks all active WebSocket connections in memory. Provides methods to accept a new connection, remove a disconnected one, and send text or JSON messages to individual connections. The connection_count property is exposed by the /test health check endpoint.
from typing import List

from fastapi import WebSocket

class ConnectionManager:
    def __init__(self):
        self.active_connections: List[WebSocket] = []

    async def connect(self, websocket):
        await websocket.accept()
        self.active_connections.append(websocket)

    def disconnect(self, websocket):
        self.active_connections.remove(websocket)

WebSocketHandler (websocket/handlers.py)

Orchestrates the full WebSocket request/response cycle after authentication:
  1. Resolves or creates the Firestore Chat document via ChatService
  2. Builds a per-user Orchestrator instance with build_dynamic_agents
  3. Runs the main message loop: receive prompt → stream agent response → finalize
  4. Streams raw text deltas, tool call status, agent handoffs, and suggested prompts back to the client as JSON events
  5. Writes the completed conversation back to Firestore and records token usage
The handler uses Runner.run_streamed from the OpenAI Agents SDK and iterates over the async event stream:
result = Runner.run_streamed(
    orchestrator_agent,
    chat.messages + [{"content": prompt, "role": "user"}],
    context=context,
)

async for event in result.stream_events():
    await self._process_stream_event(event, websocket, ...)
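Internally, _process_stream_event maps SDK stream events onto the JSON events the client receives. The sketch below is an assumption about its shape (the event-type strings and payload keys are illustrative), not the real handler, which also tracks handoffs and tool output.

```python
import json
from typing import Optional

# Hypothetical dispatch: translate a stream event into a JSON message for
# the client, or None for event types the client does not consume.
def process_stream_event(event_type: str, data: dict) -> Optional[str]:
    if event_type == "raw_response_event":
        return json.dumps({"type": "text_delta", "delta": data["delta"]})
    if event_type == "tool_call":
        return json.dumps({"type": "tool_call", "status": data["status"]})
    if event_type == "agent_updated":
        return json.dumps({"type": "agent_updated", "agent": data["name"]})
    return None

msg = process_stream_event("raw_response_event", {"delta": "Hel"})
```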

Orchestrator (connectors/orchestrator.py)

Builds the root GPT-4o agent and populates its handoffs list with every specialist agent. When a user message arrives, the orchestrator applies the H.A.N.D.O.F.F. decision framework to route work to the right agent:
  • H: Has capability — does the agent explicitly solve this task?
  • A: Access — does it have the required data or API permissions?
  • N: Novelty/Need — is a tool call necessary vs. answering from context?
  • D: Delay/Cost — prefer fewer or cheaper calls when quality is unaffected
  • O: Output quality — will it return the needed format?
  • F: Failure fallback — choose alternates if the first is likely to fail
  • F: Fusion — orchestrate multiple agents and merge results
The TOOL_CALLS dictionary maps every registered function tool name to a human-readable progress string that is surfaced to the client as a tool_call WebSocket event:
TOOL_CALLS = {
    "search_businesses_at_yelp": "Searching Yelp...",
    "fetch_google_email_inbox": "Fetching Inbox...",
    "get_stock_price_at_finnhub": "Getting Stock Price...",
    # 100+ entries
}
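The lookup itself is a one-liner; a minimal sketch, assuming the handler falls back to a generic status for unregistered tool names (the fallback string is an assumption):

```python
# Two entries copied from the TOOL_CALLS map above.
TOOL_CALLS = {
    "search_businesses_at_yelp": "Searching Yelp...",
    "fetch_google_email_inbox": "Fetching Inbox...",
}

def progress_message(tool_name: str) -> dict:
    # Fallback string for unregistered tools is an assumption
    status = TOOL_CALLS.get(tool_name, "Working...")
    return {"type": "tool_call", "status": status}
```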

Design patterns

Layered architecture

Each layer only calls downward. The API layer delegates to services, services delegate to connectors and Firebase models. This makes each layer independently testable and replaceable.

Dependency injection

Services receive their dependencies through constructors rather than creating them directly. ODAPIApplication.__init__ composes the full object graph and passes the openai_client, settings, and connection_manager into WebSocketHandler.
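Constructor injection is what makes the handler unit-testable: a test can pass in doubles instead of the real OpenAI client and connection registry. The test doubles and the send_text method below are illustrative, not the project's real test code.

```python
import asyncio

# Illustrative fake: records messages instead of sending them over a socket.
class FakeConnectionManager:
    def __init__(self):
        self.sent = []

    async def send_text(self, message):
        self.sent.append(message)

# Minimal stand-in for the real handler's constructor signature.
class WebSocketHandler:
    def __init__(self, settings, openai_client, connection_manager):
        self.settings = settings
        self.openai_client = openai_client
        self.connection_manager = connection_manager

handler = WebSocketHandler(
    settings={"env": "test"},
    openai_client=object(),  # stand-in for OpenAI(...)
    connection_manager=FakeConnectionManager(),
)
asyncio.run(handler.connection_manager.send_text("hello"))
```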

Repository pattern

Firebase model classes (Chat, User, TokenUsage) abstract all Firestore operations. The service layer never writes Firestore queries directly — it calls methods like Chat.get_chat_by_id(), Chat.create_chat(), and chat.update_messages().
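The pattern can be sketched with an in-memory stand-in for Firestore. The method names come from the paragraph above; everything else (fields, the dict-backed collection) is an assumption for illustration.

```python
# Hypothetical sketch of the repository pattern: the Chat model owns all
# persistence, so services never write Firestore queries themselves.
class Chat:
    _collection = {}  # stand-in for firestore.Client().collection("chats")

    def __init__(self, chat_id: str, messages: list):
        self.chat_id = chat_id
        self.messages = messages

    @classmethod
    def create_chat(cls, chat_id: str) -> "Chat":
        chat = cls(chat_id, [])
        cls._collection[chat_id] = chat
        return chat

    @classmethod
    def get_chat_by_id(cls, chat_id: str):
        return cls._collection.get(chat_id)

    def update_messages(self, messages: list) -> None:
        # The real model would also persist this write to Firestore
        self.messages = messages

chat = Chat.create_chat("c1")
chat.update_messages([{"role": "user", "content": "hi"}])
```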

Agent-based architecture

Each third-party service is encapsulated as a standalone OpenAI Agent with its own system prompt, tool definitions, and optional sub-handoffs. The orchestrator composes these agents at runtime. Adding a new integration means creating a new agent file in connectors/ and appending the agent to the handoffs list in orchestrator.py.
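A new connector might look like the sketch below. The Agent dataclass here is a minimal stand-in for the Agents SDK's Agent class (the real file would import it from the SDK), and the weather agent, its tool, and the Open-Meteo name are all hypothetical.

```python
from dataclasses import dataclass, field

# Minimal stand-in for the Agents SDK's Agent class.
@dataclass
class Agent:
    name: str
    instructions: str
    tools: list = field(default_factory=list)
    handoffs: list = field(default_factory=list)

def get_weather_at_open_meteo(city: str) -> str:
    """Illustrative function tool; a real connector would call the API."""
    return f"Weather for {city}"

# connectors/weather.py would define the specialist agent...
weather_agent = Agent(
    name="Weather Agent",
    instructions="Answer weather questions using the Open-Meteo tool.",
    tools=[get_weather_at_open_meteo],
)

# ...and orchestrator.py registers it as a handoff target.
orchestrator_agent = Agent(name="Orchestrator", instructions="Route requests.")
orchestrator_agent.handoffs.append(weather_agent)
```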

Async/await throughout

All I/O — WebSocket communication, Firestore reads and writes, OpenAI streaming, and external API calls — uses Python’s asyncio. The server runs under uvicorn with optional uvloop for higher throughput.

Request flow

Client

  │  WebSocket connect /chats/{chat_id}?token={token}

WebSocketHandler.handle_websocket_connection()
  │  AuthService.authenticate_websocket() → Firebase token validation
  │  ChatService.get_or_create_chat()     → Firestore read/write
  │  Orchestrator.build_dynamic_agents()  → Assemble agent graph

  │  loop: receive user prompt

WebSocketHandler._process_chat_message()
  │  Runner.run_streamed(orchestrator_agent, messages, context)

  ├─ stream raw_response_event (text delta)    → send to client
  ├─ stream tool_call event                    → send to client
  ├─ stream agent_updated event                → send to client
  ├─ stream tool_output event                  → send to client


WebSocketHandler._finalize_chat_interaction()
  │  send end_of_stream
  │  send suggested_prompts
  │  ChatService.update_chat_messages()        → Firestore write
  │  ChatService.record_token_usage()          → Firestore write

Project structure

backend/
├── api.py                      # ODAPIApplication entry point
├── config.py                   # Settings (local .env or Secret Manager)
├── requirements.txt            # 173 production dependencies

├── routers/                    # HTTP route handlers
│   ├── google.py               # Google OAuth endpoints
│   ├── plaid.py                # Plaid account linking
│   └── twilio/                 # Voice call handling

├── services/                   # Business logic
│   ├── auth_service.py         # Token validation, user auth
│   ├── chat_service.py         # Chat management, analytics
│   └── location_service.py     # IP geolocation

├── websocket/                  # Real-time communication
│   ├── connection_manager.py   # Connection registry
│   └── handlers.py             # Streaming chat loop

├── connectors/                 # OpenAI agent definitions
│   ├── orchestrator.py         # Root agent + TOOL_CALLS map
│   ├── voice_orchestrator.py   # Voice-specific agent
│   └── [service].py            # 30+ specialist agents

├── firebase/
│   ├── models/                 # Firestore document models
│   │   ├── user.py
│   │   ├── chat.py
│   │   ├── google_token.py
│   │   ├── plaid_token.py
│   │   └── token_usage.py
│   └── base.py

└── tests/                      # 700+ tests, 90%+ coverage
    ├── conftest.py
    └── test_*.py
