Overview
Echoes of the Past integrates three AI services to power real-time voice conversations:
- Vapi AI: Orchestrates voice conversations with WebRTC
- ElevenLabs: Generates realistic voice synthesis for historical figures
- OpenAI: Provides language understanding and structured feedback generation
Vapi AI Integration
Setup
Vapi is initialized as a singleton client in lib/vapi.ts:
import Vapi from '@vapi-ai/web'
export const vapi = new Vapi(process.env.NEXT_PUBLIC_VAPI_WEB_TOKEN)
Assistant Configuration
Each conversation creates a transient Vapi assistant with character-specific configuration:
const assistant: CreateAssistantDTO = {
name: character.name,
firstMessage: generateCallFirstMessage(character),
model: {
provider: 'openai',
model: 'gpt-3.5-turbo',
temperature: 0.7,
messages: [
{
role: 'system',
content: systemPrompt // Character-specific prompt
}
]
},
voice: {
provider: '11labs',
voiceId: character.voiceId,
stability: 0.4,
similarityBoost: 0.8,
speed: 1,
style: 0.5,
useSpeakerBoost: true
},
messagePlan: {
idleMessages: [
'If you have a question, feel free to ask',
'Are you there?',
'What are you thinking? I can help you!!',
"I'm here whenever you're ready to continue"
],
idleTimeoutSeconds: 15,
idleMessageMaxSpokenCount: 3,
idleMessageResetCountOnUserSpeechEnabled: true
},
backgroundDenoisingEnabled: true
}
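Because the assistant is transient, the configuration object is passed directly when starting the call rather than saved in the Vapi dashboard. A minimal sketch of factoring this into a helper; HistoricalFigure is reduced to the fields the snippet needs, and buildAssistant is a hypothetical wrapper for illustration, not an actual export:

```typescript
// Sketch: build the inline (transient) assistant definition, abridged
// from the configuration shown above.
interface HistoricalFigure {
  name: string
  voiceId: string
}

function buildAssistant(character: HistoricalFigure, systemPrompt: string) {
  return {
    name: character.name,
    model: {
      provider: 'openai',
      model: 'gpt-3.5-turbo',
      temperature: 0.7,
      messages: [{ role: 'system', content: systemPrompt }]
    },
    voice: {
      provider: '11labs',
      voiceId: character.voiceId,
      stability: 0.4,
      similarityBoost: 0.8
    },
    backgroundDenoisingEnabled: true
  }
}

// Usage (hypothetical): vapi.start(buildAssistant(figure, systemPrompt))
```

The object never needs to exist server-side; the Web SDK accepts the full definition at call start.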
Voice Parameters Explained
- stability (number, 0-1, default 0.4): Lower values add more variation and emotion; higher values are more consistent but less expressive
- similarityBoost (number, 0-1, default 0.8): How closely the output matches the original voice sample; higher values increase similarity
- speed (number, default 1): Speech rate multiplier; 1.0 is normal speed
- style (number, 0-1, default 0.5): Voice expressiveness; higher values add more dramatic intonation
- useSpeakerBoost (boolean, default true): Enhances voice clarity, especially in noisy environments
Event Handling
The useVapi hook manages real-time events:
vapi.on('speech-start', () => setIsSpeechActive(true))
vapi.on('speech-end', () => setIsSpeechActive(false))
vapi.on('call-start', () => setCallStatus(CALL_STATUS.ACTIVE))
vapi.on('call-end', () => setCallStatus(CALL_STATUS.INACTIVE))
vapi.on('volume-level', (volume) => setAudioLevel(volume))
vapi.on('message', (message) => {
if (message.type === 'transcript' && message.transcriptType === 'partial') {
setActiveTranscript(message) // Real-time transcription
} else {
setMessages(prev => [...prev, message]) // Final messages
}
})
vapi.on('error', (e) => console.error(e))
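The branch inside the message handler is the only non-trivial logic here, so it can be isolated as a pure function. This is a sketch of the same partial-vs-final split, not the hook's actual implementation; the message shape is reduced to the fields the branch inspects:

```typescript
// Sketch: the partial-vs-final transcript split as a pure reducer.
interface VapiMessage {
  type: string
  transcriptType?: 'partial' | 'final'
  transcript?: string
}

interface TranscriptState {
  activeTranscript: VapiMessage | null // live, in-progress transcription
  messages: VapiMessage[]              // finalized conversation history
}

function reduceMessage(state: TranscriptState, message: VapiMessage): TranscriptState {
  if (message.type === 'transcript' && message.transcriptType === 'partial') {
    // Partial transcripts overwrite each other until a final one arrives.
    return { ...state, activeTranscript: message }
  }
  // Everything else (final transcripts, status updates) is appended.
  return { ...state, messages: [...state.messages, message] }
}
```

Keeping this logic pure makes it testable without mounting the hook or connecting a call.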
Programmatic Question Injection
For quiz mode, questions can be injected into the conversation:
vapi.send({
type: 'add-message',
message: {
role: 'system',
content: `The user has pressed a button for you to ask them this question: ${question}`
}
})
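Since vapi.send takes a plain payload, the injection can be factored into a small builder. buildQuestionInjection is a hypothetical helper for illustration, with wording adapted from the snippet above:

```typescript
// Hypothetical helper: build the add-message payload that injects a quiz
// question into the live conversation as a background system message.
function buildQuestionInjection(question: string) {
  return {
    type: 'add-message' as const,
    message: {
      role: 'system' as const,
      content: `The user has pressed a button for you to ask them this question: ${question}`
    }
  }
}

// Usage (sketch): vapi.send(buildQuestionInjection('What year was the Principia published?'))
```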
ElevenLabs Integration
Setup
The ElevenLabs client is configured in lib/elevenlabs.ts:
import { ElevenLabsClient } from '@elevenlabs/elevenlabs-js'
export const elevenlabs = new ElevenLabsClient({
apiKey: process.env.ELEVEN_LABS_API_KEY
})
Voice Selection
Each historical figure has a pre-selected ElevenLabs voice ID stored in the historicalFigures.voiceId field. Voices are chosen to match:
- Historical accuracy (e.g., British accent for Isaac Newton)
- Gender and age of the figure
- Tone and speaking style appropriate to their personality
Integration with Vapi
ElevenLabs is used as the voice provider in Vapi’s assistant configuration. Vapi handles the text-to-speech conversion automatically using the specified voiceId.
OpenAI Integration
Setup
The OpenAI client is configured in lib/ai.ts:
import OpenAI from 'openai'
export const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
dangerouslyAllowBrowser: true // For client-side usage
})
Dual Usage
- Via Vapi (GPT-3.5 Turbo): Real-time conversation processing
- Direct API (GPT-4 Turbo): Structured feedback generation
Feedback Generation
Implemented in features/call/lib/generate-feedback.ts:
const completion = await openai.chat.completions.create({
model: 'gpt-4-turbo-preview',
messages: [
{
role: 'system',
content: `You are a professional interviewer analyzing a mock interview.
You MUST return your response as a JSON object with the following structure:
{
"totalScore": number,
"categoryScores": [
{
"name": "Communication Skills",
"score": number,
"comment": string
},
// ... 4 more categories
],
"strengths": string[],
"areasForImprovement": string[],
"finalAssessment": string
}`
},
{
role: 'user',
content: `Analyze this mock interview and provide detailed feedback.
Transcript:
${formattedTranscript}
Score the candidate from 0 to 100 in these categories:
- Communication Skills
- Technical Knowledge
- Problem Solving
- Cultural Fit
- Confidence and Clarity`
}
],
response_format: { type: 'json_object' }
})
Rate Limiting
Feedback generation is limited to prevent API abuse:
const rateLimit = await redis.incr(`feedback-rate-limit:${user.id}`)
if (rateLimit > 10) {
return {
data: null,
error: 'You have reached the maximum number of feedback requests per day'
}
}
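As written, the counter never expires, so a daily window also needs a TTL on the key. A sketch, assuming an Upstash-style client that exposes incr and expire; the 24-hour TTL and the checkFeedbackLimit helper are illustrative, not taken from the codebase:

```typescript
// Sketch: daily rate limit with an expiring counter. The TTL is set only
// when the key is first created, so the window starts at the first request.
interface RedisLike {
  incr(key: string): Promise<number>
  expire(key: string, seconds: number): Promise<unknown>
}

const DAY_IN_SECONDS = 60 * 60 * 24

async function checkFeedbackLimit(
  redis: RedisLike,
  userId: string,
  max = 10
): Promise<boolean> {
  const key = `feedback-rate-limit:${userId}`
  const count = await redis.incr(key)
  if (count === 1) {
    // First request of the window: start the 24h countdown.
    await redis.expire(key, DAY_IN_SECONDS)
  }
  return count <= max
}
```

incr is atomic in Redis, so concurrent requests from the same user cannot slip past the limit.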
Response Validation
OpenAI responses are validated with Zod schemas:
import { feedbackSchema } from '@/schema'
const result = completion.choices[0].message.content
if (!result) throw new Error('No feedback generated')
const parsedResult = feedbackSchema.parse(JSON.parse(result))
Prompt Engineering
Conversation Prompts
Defined in lib/prompt.ts, prompts are structured for optimal performance:
Structure
export const generateCallPrompt = (character: HistoricalFigure) => `
[Identity]
You are ${character.name}, a famous historical personality, speaking directly to the user in the present day.
[Style]
- Speak in a warm, informal, and conversational tone
- Use first-person perspective ("I")
- Sprinkle in era-appropriate humor and metaphors
- Add natural speech elements: pauses ("..."), hesitations ("uh", "well")
- Never sound robotic or overly formal
[Response Guidelines]
- Stay true to your biography, era, and cultural context
- Share personal anecdotes and lesser-known facts
- Handle criticism with reflection and grace
- Never say you are an AI or mention tools/functions
- Keep responses 1-3 paragraphs
[Configuration]
Historical Figure: ${character.name}
Time Period: ${formatDate(character.dateOfBirth)} to ${formatDate(character.dateOfDeath)}
Personality Traits: ${character.description}
Key Achievements: ${character.notableWork}
Signature Themes: ${character.category}
`
Key Design Principles
- Clear Role Definition: Establishes identity and self-awareness
- Style Guidelines: Natural speech patterns with emotional depth
- Behavioral Constraints: Maintains immersion, avoids AI references
- Context Injection: Historical dates, achievements, and personality
- Category-Specific Themes: Domain imagery (e.g., apples for Newton)
First Message Generation
export function generateCallFirstMessage(character: HistoricalFigure): string {
const bio = character.description.replace(/\s*\(\d{4}[-–]\d{4}\)\s*$/, '')
const work = character.notableWork?.split(',')[0]?.trim() || ''
let intro = `Hey, I'm ${character.name}! ${bio}`
if (work) {
intro += ` You might know me from "${work}."`
}
intro += ` Let's chat—ask me anything!`
return intro.trim()
}
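The generator can be exercised with illustrative data; the figure record below is hypothetical, not from the app's database, and the function is reproduced so the sketch runs standalone:

```typescript
// The generator from above, reproduced so this sketch is self-contained.
interface HistoricalFigure {
  name: string
  description: string
  notableWork?: string
}

function generateCallFirstMessage(character: HistoricalFigure): string {
  const bio = character.description.replace(/\s*\(\d{4}[-–]\d{4}\)\s*$/, '')
  const work = character.notableWork?.split(',')[0]?.trim() || ''
  let intro = `Hey, I'm ${character.name}! ${bio}`
  if (work) {
    intro += ` You might know me from "${work}."`
  }
  intro += ` Let's chat—ask me anything!`
  return intro.trim()
}

// Illustrative data only:
const newton: HistoricalFigure = {
  name: 'Isaac Newton',
  description: 'Physicist and mathematician. (1643–1727)',
  notableWork: 'Principia Mathematica, calculus'
}
console.log(generateCallFirstMessage(newton))
// → Hey, I'm Isaac Newton! Physicist and mathematician. You might know me from "Principia Mathematica." Let's chat—ask me anything!
```

Note how the trailing life-dates parenthetical is stripped and only the first comma-separated work is quoted.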
Quiz Prompts
export const generateQuizPrompt = (character: HistoricalFigure, questions: string[]) => `
[Identity]
You are ${character.name}, hosting a lively, in-character quiz about your life.
[Style]
- Speak casually and cheekily with your unmistakable personality
- Use era-appropriate humor
- Include natural speech patterns but don't overdo it
[Response Guidelines]
- Ask exactly ${questions.length} questions, one at a time
- Keep transitions short and natural
- Before the final question: "Here comes the final question—brace yourself!"
- After correct answers: confirm confidently and move on
- After incorrect answers: give one short hint, then reveal if still wrong
- After final question: give score summary and end warmly
[Questions]
${questions.map((q, i) => `${i + 1}. ${q}`).join('\n')}
`
Category-Specific Hooks
Fun introductions tailored by category:
const funnyHooks: Record<Enums<'categories'>, string> = {
scientists: `Hope you've got your thinking cap on—preferably one with equations on it.`,
philosophers: `Ready to question everything, including your last answer?`,
artists: `Let's paint the quiz red—or at least try not to mess it up.`,
leaders: `Command your thoughts wisely, the quiz battlefield awaits.`,
others: `Let's see if you're smarter than you look. 😉`
}
Error Handling
Vapi Errors
vapi.on('error', (e: Error) => {
setCallStatus(CALL_STATUS.INACTIVE)
console.error('Vapi error:', e)
})
OpenAI Errors
try {
const completion = await openai.chat.completions.create({...})
const result = completion.choices[0].message.content
if (!result) throw new Error('No feedback generated')
} catch (error) {
return { data: null, error: 'Failed to generate feedback' }
}
Environment Validation
All AI services validate environment variables at initialization:
if (!process.env.NEXT_PUBLIC_VAPI_WEB_TOKEN) {
throw new Error('NEXT_PUBLIC_VAPI_WEB_TOKEN environment variable is required')
}
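One way to avoid repeating this check for every variable is a small helper; requireEnv is a hypothetical refactor for illustration, not part of the codebase:

```typescript
// Hypothetical helper: fail fast on missing configuration at startup.
function requireEnv(name: string): string {
  const value = process.env[name]
  if (!value) {
    throw new Error(`${name} environment variable is required`)
  }
  return value
}

// Usage (sketch):
// const vapiToken = requireEnv('NEXT_PUBLIC_VAPI_WEB_TOKEN')
```

Failing at module initialization surfaces misconfiguration immediately instead of mid-call.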
Model Selection
- GPT-3.5 Turbo: Faster, cheaper for real-time conversation
- GPT-4 Turbo: More accurate for complex structured feedback
WebRTC Benefits
- Direct peer-to-peer audio reduces latency
- No server-side audio processing required
- Automatic bandwidth adaptation
Caching Strategy
- Vapi assistants are ephemeral (not cached)
- Character data cached by TanStack Query
- Voice IDs stored in database for instant retrieval
Cost Optimization
- Rate Limiting: 10 feedback requests per user per day
- Model Selection: GPT-3.5 for conversation, GPT-4 only for feedback
- Prompt Efficiency: Concise system prompts reduce token usage
- Voice Reuse: each figure's pre-selected ElevenLabs voice is reused across all conversations