Architecture

Stanzo’s architecture is built around three core services that process debate audio in real time:
  1. Deepgram for live transcription with speaker diarization
  2. Gemini for context-aware claim extraction
  3. Perplexity for fact-checking with primary sources
All data flows through Convex, which handles database storage, reactive subscriptions, and async job scheduling.

System overview

┌─────────────┐
│   Browser   │
│  Microphone │
└──────┬──────┘
       │ MediaStream (WebRTC)
       ▼
┌─────────────────┐
│    Deepgram     │  Live WebSocket
│ nova-3 + diarize│  (interim + final transcripts)
└────────┬────────┘
         │
         ▼
┌──────────────────────────────────────────┐
│              Convex Backend              │
│                                          │
│  ┌─────────────────────────────────┐    │
│  │   transcriptChunks table        │    │
│  │   (speaker, text, timestamps)   │    │
│  └──────────┬──────────────────────┘    │
│             │                            │
│             ▼                            │
│  ┌─────────────────────────────────┐    │
│  │  Gemini 2.5 Flash Extraction    │    │
│  │  (multi-turn conversation)      │    │
│  └──────────┬──────────────────────┘    │
│             │                            │
│             ▼                            │
│  ┌─────────────────────────────────┐    │
│  │      claims table (pending)     │    │
│  └──────────┬──────────────────────┘    │
│             │                            │
│             ▼                            │
│  ┌─────────────────────────────────┐    │
│  │   Perplexity Sonar Fact-Check   │    │
│  │   (async scheduled action)      │    │
│  └──────────┬──────────────────────┘    │
│             │                            │
│             ▼                            │
│  ┌─────────────────────────────────┐    │
│  │   claims table (verdict + src)  │    │
│  └──────────┬──────────────────────┘    │
│             │                            │
└─────────────┼──────────────────────────┘
              │ Reactive subscription
              ▼
         ┌─────────┐
         │ React UI│
         │(Next.js)│
         └─────────┘

Data flow

1. Live transcription

When a debate starts, the browser opens a WebSocket connection to Deepgram’s live API.

Client-side streaming (src/hooks/useDeepgram.ts:33-80):
const start = async (debateId) => {
  // Get temporary Deepgram token from Convex
  const { token } = await mintToken()

  // Request microphone access
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
    },
  })

  // Create Deepgram connection
  const client = createClient({ accessToken: token })
  const connection = client.listen.live({
    model: "nova-3",
    language: "en",
    smart_format: true,
    punctuate: true,
    diarize: true,              // Speaker separation
    interim_results: true,      // Show live partial results
    utterance_end_ms: 1500,     // Trigger after 1.5s silence
  })

  // Pipe audio chunks to Deepgram
  const recorder = new MediaRecorder(stream, {
    mimeType: "audio/webm;codecs=opus",
  })
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0 && connection.getReadyState() === 1) {
      connection.send(event.data)
    }
  }
  recorder.start(250)  // Send chunks every 250ms
}
Transcript processing (src/hooks/useDeepgram.ts:82-112): Deepgram sends two types of events:
  • Interim results: Partial transcriptions shown as gray text while someone is speaking
  • Final results: Confirmed transcripts saved to the database
connection.on(LiveTranscriptionEvents.Transcript, async (data) => {
  const alt = data.channel.alternatives[0]
  if (!alt?.transcript?.trim()) return

  const { transcript } = alt
  const speaker = alt.words[0]?.speaker ?? 0  // 0 or 1 from diarization
  const startTime = data.start
  const duration = data.duration

  // Show interim results in UI
  if (!data.is_final) {
    setInterim({ text: transcript, speaker })
    return
  }

  // Save final transcript to Convex
  await insertChunk({
    debateId,
    speaker: speaker === 0 ? 0 : 1,
    text: transcript,
    startTime,
    endTime: startTime + duration,
  })

  setInterim(null)
})

// Trigger claim extraction after 1.5s silence
connection.on(LiveTranscriptionEvents.UtteranceEnd, () => {
  triggerExtraction({ debateId })
})
Database schema (convex/schema.ts:19-29):
transcriptChunks: defineTable({
  debateId: v.id("debates"),
  speaker: v.union(v.literal(0), v.literal(1)),
  text: v.string(),
  startTime: v.number(),
  endTime: v.number(),
  processedForClaims: v.boolean(),  // Prevents duplicate extraction
})
  .index("by_debate", ["debateId"])
  .index("by_debate_unprocessed", ["debateId", "processedForClaims"])
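The by_debate_unprocessed compound index is what lets the extraction pipeline fetch only new chunks without scanning the whole transcript. A minimal sketch of what the getUnprocessed query used later might look like (the function body is an assumption based on the schema above, not the repo's actual code):

```typescript
import { v } from "convex/values"
import { internalQuery } from "./_generated/server"

// Hypothetical implementation: read only chunks not yet sent to Gemini,
// using the compound index so the lookup stays cheap as transcripts grow.
export const getUnprocessed = internalQuery({
  args: { debateId: v.id("debates") },
  handler: async (ctx, { debateId }) => {
    return await ctx.db
      .query("transcriptChunks")
      .withIndex("by_debate_unprocessed", (q) =>
        q.eq("debateId", debateId).eq("processedForClaims", false),
      )
      .collect()
  },
})
```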

2. Multi-turn claim extraction

Every utterance boundary (1.5 seconds of silence) triggers a Gemini extraction session.

Why multi-turn sessions matter (convex/claimExtraction.ts:78-96): Stanzo maintains the full conversation history with Gemini for each debate. This allows the model to:
  • Avoid duplicates: Remember claims already extracted
  • Resolve context: Understand “that number” refers to a statistic from 2 minutes ago
  • Track continuity: Know when a speaker is elaborating vs. making a new claim
function buildSystemPrompt(speakerA: string, speakerB: string): string {
  return `You are a factual claim extractor for a live debate between ${speakerA} (speaker 0) and ${speakerB} (speaker 1).

Each turn, I provide a new transcript segment. You have the full conversation history.

Rules:
- ONLY extract claims from the NEW segment in my latest message
- Do NOT re-extract claims from previous turns
- Extract specific, verifiable factual claims (statistics, dates, named facts, causal claims)
- Extract the factual core when mixed with opinion
- Ignore purely opinion/prediction/subjective statements
- Use context to resolve pronouns and references

Output: JSONL, one object per line:
- speaker: 0 for ${speakerA}, 1 for ${speakerB}
- claimText: concise factual claim
- originalTranscriptExcerpt: quote from the new segment

If no factual claims, output: NO_CLAIMS
No markdown, no explanation, no array brackets.`
}
Extraction flow (convex/claimExtraction.ts:99-168):
export const extract = internalAction({
  handler: async (ctx, { debateId }) => {
    // Get unprocessed transcript chunks
    const chunks = await ctx.runQuery(
      internal.transcriptChunks.getUnprocessed,
      { debateId }
    )
    if (chunks.length === 0) return null

    // Mark as processed BEFORE calling LLM to prevent race conditions
    await ctx.runMutation(internal.transcriptChunks.markProcessed, {
      chunkIds: chunks.map((c) => c._id),
    })

    // Load existing conversation history from extractionSessions table
    const session = await ctx.runQuery(
      internal.extractionSessions.getByDebate,
      { debateId }
    )
    const existingMessages = session?.messages ?? []

    // Build new user message from chunks
    const newUserMessage = chunks
      .map((c) => `[${speakerNames[c.speaker]}]: ${c.text}`)
      .join("\n")

    const messages = [
      ...existingMessages,
      { role: "user", content: newUserMessage },
    ]

    // Stream claims from Gemini; streamClaims resolves to the full response text
    const result = await streamClaims(apiKey, systemPrompt, messages, async (claim) => {
      // Save each claim as it's parsed
      await ctx.runMutation(internal.claims.saveClaim, {
        debateId,
        speaker: claim.speaker,
        claimText: claim.claimText,
        originalTranscriptExcerpt: claim.originalTranscriptExcerpt,
      })
    })

    // Persist updated conversation history
    await ctx.runMutation(internal.extractionSessions.upsert, {
      debateId,
      messages: [...messages, { role: "model", content: result }],
    })
  },
})
JSONL streaming (convex/claimExtraction.ts:30-75): Claims are parsed line-by-line from Gemini’s response, so they appear in the UI incrementally:
for await (const chunk of stream) {
  buffer += chunk.text

  // Process complete lines
  while ((newlineIdx = buffer.indexOf("\n")) !== -1) {
    const line = buffer.slice(0, newlineIdx).trim()
    buffer = buffer.slice(newlineIdx + 1)
    
    const claim = parseClaim(line)  // Parse JSON from line
    if (claim) await onClaim(claim)  // Save to database immediately
  }
}
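The parseClaim call above has to tolerate whatever Gemini emits, including the NO_CLAIMS sentinel and half-streamed or malformed lines. A sketch of a defensive line parser under the JSONL contract from the system prompt (the actual implementation in convex/claimExtraction.ts may differ):

```typescript
interface ExtractedClaim {
  speaker: 0 | 1
  claimText: string
  originalTranscriptExcerpt: string
}

// Hypothetical defensive parser for one JSONL line from Gemini; returns null
// for blank lines, the NO_CLAIMS sentinel, and malformed or partial output.
function parseClaim(line: string): ExtractedClaim | null {
  if (!line || line === "NO_CLAIMS") return null
  try {
    const obj = JSON.parse(line) as Record<string, unknown>
    if (
      (obj.speaker !== 0 && obj.speaker !== 1) ||
      typeof obj.claimText !== "string" ||
      typeof obj.originalTranscriptExcerpt !== "string"
    ) {
      return null
    }
    return {
      speaker: obj.speaker as 0 | 1,
      claimText: obj.claimText,
      originalTranscriptExcerpt: obj.originalTranscriptExcerpt,
    }
  } catch {
    return null // Swallow malformed lines instead of failing the whole stream
  }
}
```

Returning null rather than throwing means one garbled line never aborts the stream; the loop simply moves on to the next line.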
Database schema (convex/schema.ts:31-52):
claims: defineTable({
  debateId: v.id("debates"),
  speaker: v.union(v.literal(0), v.literal(1)),
  claimText: v.string(),
  originalTranscriptExcerpt: v.string(),
  status: v.union(
    v.literal("pending"),
    v.literal("checking"),
    v.literal("true"),
    v.literal("false"),
    v.literal("mixed"),
    v.literal("unverifiable"),
  ),
  verdict: v.optional(v.string()),
  correction: v.optional(v.string()),
  sources: v.optional(v.array(v.string())),
  extractedAt: v.number(),
  checkedAt: v.optional(v.number()),
})
  .index("by_debate", ["debateId"])
  .index("by_status", ["status"])
Extraction sessions (convex/schema.ts:54-62):
extractionSessions: defineTable({
  debateId: v.id("debates"),
  messages: v.array(
    v.object({
      role: v.union(v.literal("user"), v.literal("model")),
      content: v.string(),
    }),
  ),
}).index("by_debate", ["debateId"])

3. Asynchronous fact-checking

Every time a claim is saved with pending status, Convex triggers a scheduled action to fact-check it with Perplexity.

Fact-check flow (convex/factCheck.ts:93-131):
export const check = internalAction({
  handler: async (ctx, { claimId }) => {
    // Update to "checking" status
    await ctx.runMutation(internal.claims.updateStatus, {
      claimId,
      status: "checking",
    })

    // Fetch claim details
    const claim = await ctx.runQuery(internal.claims.getById, { claimId })
    if (!claim) return null

    // Call Perplexity Sonar
    const factCheck = await callPerplexity(apiKey, claim.claimText)

    // Update claim with results
    await ctx.runMutation(internal.claims.updateStatus, {
      claimId,
      status: factCheck.status,      // true/false/mixed/unverifiable
      verdict: factCheck.verdict,    // Explanation text
      correction: factCheck.correction,  // Corrected info if needed
      sources: factCheck.citations,  // Array of URLs
    })
  },
})
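The scheduling itself happens at claim-insertion time. A sketch of how the saveClaim mutation might enqueue the check (the exact mutation body is an assumption; ctx.scheduler.runAfter is Convex's scheduling API):

```typescript
import { internal } from "./_generated/api"
import { internalMutation } from "./_generated/server"

// Hypothetical wiring: insert a pending claim, then immediately schedule
// its fact-check so slow Perplexity calls never block extraction.
export const saveClaim = internalMutation({
  handler: async (ctx, args) => {
    const claimId = await ctx.db.insert("claims", {
      ...args,
      status: "pending",
      extractedAt: Date.now(),
    })
    // runAfter(0, ...) enqueues the action outside this transaction;
    // multiple pending claims are checked in parallel
    await ctx.scheduler.runAfter(0, internal.factCheck.check, { claimId })
    return claimId
  },
})
```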
Perplexity integration (convex/factCheck.ts:39-91): The fact-checker uses the Effect library for retry logic and timeouts:
const callPerplexity = (apiKey: string, claimText: string) =>
  Effect.gen(function* () {
    const client = new Perplexity({ apiKey })

    const response = yield* Effect.tryPromise({
      try: () =>
        client.chat.completions.create({
          model: "sonar",
          messages: [
            {
              role: "system",
              content:
                "You are a fact-checker. Evaluate the following claim and respond with ONLY a JSON object containing: status (one of: true, false, mixed, unverifiable), verdict (brief explanation), correction (if false or mixed, the correct information; otherwise null). Keep verdict and correction to ~30 words each.",
            },
            {
              role: "user",
              content: `Fact-check this claim: "${claimText}"`,
            },
          ],
        }),
      catch: (e) => new PerplexityApiError({ message: String(e) }),
    })

    // Parse the model's JSON reply and extract citations
    const content = response.choices?.[0]?.message?.content
    const { status, verdict, correction } = JSON.parse(content ?? "{}")
    const citations = (response.citations ?? []).map(String)

    return { status, verdict, correction, citations }
  }).pipe(
    Effect.retry({
      schedule: Schedule.exponential(Duration.seconds(1)).pipe(
        Schedule.intersect(Schedule.recurs(3)),  // Max 3 retries
      ),
      while: (e) => e instanceof PerplexityApiError,
    }),
    Effect.timeout(Duration.seconds(30)),  // 30s timeout
  )
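Even with a strict system prompt, LLM replies occasionally deviate from pure JSON, so the content string benefits from defensive parsing before the verdict is written back. A minimal sketch (parsePerplexityReply is a hypothetical helper, not the repo's actual code):

```typescript
type Status = "true" | "false" | "mixed" | "unverifiable"

interface ParsedVerdict {
  status: Status
  verdict: string
  correction: string | null
}

const VALID_STATUSES: Status[] = ["true", "false", "mixed", "unverifiable"]

// Hypothetical helper: coerce the model's reply into the shape the claims
// table expects, falling back to "unverifiable" on anything malformed.
function parsePerplexityReply(content: string): ParsedVerdict {
  try {
    const parsed = JSON.parse(content.trim()) as Record<string, unknown>
    const status = VALID_STATUSES.includes(parsed.status as Status)
      ? (parsed.status as Status)
      : "unverifiable"
    return {
      status,
      verdict: typeof parsed.verdict === "string" ? parsed.verdict : "",
      correction: typeof parsed.correction === "string" ? parsed.correction : null,
    }
  } catch {
    // The model ignored the JSON-only instruction; degrade gracefully
    return {
      status: "unverifiable",
      verdict: "Unparseable fact-check response",
      correction: null,
    }
  }
}
```

Mapping every failure mode to "unverifiable" keeps the claims table's status union intact, so the UI never sees an unknown state.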

4. Reactive UI updates

Convex powers the UI with reactive subscriptions. When a claim’s status changes in the database, the React component re-renders automatically.

Query subscriptions (src/app/debates/new/page.tsx:20-23):
const debateArgs = debateId ? { debateId } : ("skip" as const)
const debate = useQuery(api.debates.get, debateArgs)
const chunks = useQuery(api.transcriptChunks.listByDebate, debateArgs)
const claims = useQuery(api.claims.listByDebate, debateArgs)
When claims updates (e.g., a claim goes from pending to true), React automatically re-renders the ClaimsSidebar component with the new data. No polling required.

Tech stack details

Frontend: Next.js 16 + React 19

  • App Router: File-based routing with layouts
  • Server Components: Pre-render static pages like the landing page
  • Client Components: Interactive debate UI with real-time updates
  • Tailwind CSS 4: Utility-first styling with custom design system
Key dependencies (package.json:13-26):
{
  "dependencies": {
    "@deepgram/sdk": "4.11.3",
    "@convex-dev/auth": "0.0.91",
    "convex": "1.32.0",
    "next": "16.1.6",
    "react": "19.2.4",
    "@phosphor-icons/react": "2.1.10"
  }
}

Backend: Convex

Convex provides:
  • Database: Stores debates, transcripts, claims, and extraction sessions
  • Mutations: Create debates, insert chunks, save claims
  • Queries: Fetch debates, list claims, get transcripts
  • Actions: Call external APIs (Gemini, Perplexity)
  • Scheduling: Trigger fact-checks after claim insertion
  • Subscriptions: Push live updates to React
Authentication (convex/auth.ts):
import GitHub from "@auth/core/providers/github"
import { convexAuth } from "@convex-dev/auth/server"

export const { auth, signIn, signOut, isAuthenticated } = convexAuth({
  providers: [GitHub],
})

AI services

Service      Model      Purpose             Key Features
Deepgram     nova-3     Live transcription  Speaker diarization, interim results, 1.5s utterance detection
Gemini       2.5 Flash  Claim extraction    Multi-turn conversations, streaming JSONL output, 4096 token limit
Perplexity   Sonar      Fact-checking       Web search, citation extraction, structured JSON responses

Error handling: Effect library

Stanzo uses the Effect library for functional error handling instead of try/catch chains. Benefits:
  • Retries: Exponential backoff for transient API failures
  • Timeouts: Prevent hanging requests
  • Type-safe errors: Structured error types like PerplexityApiError
  • Composable: Chain operations with .pipe()
Example (convex/factCheck.ts:84-91):
Effect.retry({
  schedule: Schedule.exponential(Duration.seconds(1)).pipe(
    Schedule.intersect(Schedule.recurs(3)),  // 1s, 2s, 4s, then fail
  ),
  while: (e) => e instanceof PerplexityApiError,
}),
Effect.timeout(Duration.seconds(30)),

Key design decisions

Why multi-turn extraction sessions?

Without conversation history, Gemini would:
  • Re-extract the same claim multiple times
  • Struggle with pronouns like “he said that number is wrong”
  • Miss when speakers circle back to earlier topics
By maintaining full context, Stanzo extracts claims accurately even in fast-paced debates with cross-talk and references.

Why JSONL streaming instead of batch extraction?

Streaming claims line-by-line means:
  • Users see results faster (claims appear as they’re parsed)
  • Gemini doesn’t have to finish the entire response before saving
  • Lower perceived latency in the UI
A batch approach would wait for all claims, then insert them at once—creating a longer delay.

Why async fact-checking?

Decoupling extraction from fact-checking prevents slow Perplexity calls from blocking Gemini. If a claim takes 10 seconds to verify, it shouldn’t delay extraction of the next utterance. Convex’s scheduler runs fact-checks in parallel, so multiple claims are verified simultaneously.

Why Convex instead of traditional backend?

Convex removes the need to:
  • Set up WebSocket infrastructure for real-time updates
  • Build a job queue for async actions
  • Write SQL migrations for schema changes
  • Deploy separate API servers
Everything runs on Convex’s serverless platform, and the React client subscribes to database changes automatically.

Performance characteristics

  • Transcription latency: ~500ms from speech to transcript appearing in the UI (Deepgram’s nova-3 model)
  • Claim extraction latency: 1-3 seconds after an utterance boundary (depends on Gemini response time and conversation length)
  • Fact-check latency: 3-10 seconds per claim (Perplexity searches the web and evaluates sources)
  • UI update latency: ~100ms from database write to React re-render (Convex reactive subscriptions)
Total time from spoken word to verified claim: 5-15 seconds depending on claim complexity and API response times.

Next steps

  • API Reference: explore the Convex backend functions
  • Deploy your own: set up environment variables and deploy Stanzo to Vercel + Convex
