GitaChat generates two types of commentary for each search result: traditional commentary from classical sources and AI-generated contextual commentary that directly addresses the user’s question. This dual approach provides both scholarly depth and personal relevance.

Overview

The AI commentary system enhances the user experience by bridging ancient wisdom with modern questions.

Traditional Commentary

Pre-existing scholarly interpretations stored as metadata in Pinecone vectors

Contextual Commentary

AI-generated response tailored to the user’s specific question using GPT-4o-mini

Hybrid Display

Both commentaries shown side-by-side, allowing users to choose their preferred depth

Fallback Strategy

If AI generation fails, falls back to traditional commentary to ensure reliability

OpenAI Integration

GitaChat uses OpenAI’s GPT-4o-mini model for fast, cost-effective commentary generation.

Client Configuration

OpenAI Client Setup

Initialization (backend/clients.py:20-21):
from openai import OpenAI

# OpenAI client with timeout
openai_client = OpenAI(api_key=GPT_KEY, timeout=30.0)
Configuration Details:
  • API Key: Loaded from GPT_KEY environment variable
  • Timeout: 30 seconds to prevent hanging requests
  • Model: gpt-4o-mini (fast, affordable, high quality)
Why GPT-4o-mini?:
  • Dramatically cheaper than GPT-4 (see Cost Breakdown below)
  • Roughly 2x faster response times than GPT-4
  • Sufficient quality for commentary generation
  • Better instruction following than GPT-3.5

Environment Configuration

Required Environment Variable (backend/config.py:22):
GPT_KEY = os.getenv("GPT_KEY")

if not GPT_KEY:
    raise ValueError("GPT_KEY environment variable is required")
Security Best Practices:
  • Never commit API keys to version control
  • Use a .env file for local development (see the sketch after this list)
  • Use platform secrets for production (Railway, Render, etc.)
  • Rotate keys regularly
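Local Setup Sketch: A minimal example of the .env workflow, assuming python-dotenv (whether backend/config.py actually uses it is not shown here):
import os

from dotenv import load_dotenv

# .env (git-ignored, never committed) contains a single line: GPT_KEY=<your key>
load_dotenv()  # reads .env into the process environment
GPT_KEY = os.getenv("GPT_KEY")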

Contextual Commentary Generation

The core AI feature that makes GitaChat’s responses feel personal and relevant.

Implementation

Generate Contextual Commentary Function

Function Signature (backend/utils.py:34):
def generate_contextual_commentary(query: str, verse: dict) -> str:

It takes the user's original question and a verse dict (chapter, verse, translation, and optionally full_commentary/summarized_commentary), and returns commentary tailored to that question; the full docstring appears in the implementation below.
Complete Implementation (backend/utils.py:34-68):
def generate_contextual_commentary(query: str, verse: dict) -> str:
    """
    Generate commentary that specifically addresses the user's question.

    Args:
        query: The user's original question
        verse: Dict with chapter, verse, translation, and optionally full_commentary/summarized_commentary

    Returns:
        Contextual commentary string tailored to the user's question
    """
    # Get available commentary for context
    commentary_context = verse.get("full_commentary") or verse.get("summarized_commentary") or ""
    if commentary_context:
        commentary_context = f"\n\nTraditional commentary for context:\n{commentary_context[:1500]}"

    prompt = f"""The user asked: "{query}"

The most relevant verse from the Bhagavad Gita is Chapter {verse['chapter']}, Verse {verse['verse']}:
"{verse['translation']}"{commentary_context}

Write a 2-3 paragraph response that:
1. Explains how this verse directly addresses their situation or question
2. Draws practical wisdom they can apply to their life
3. Maintains a warm, thoughtful tone without being preachy

Vary your opening - don't start with "This verse...". Keep it concise but meaningful."""

    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=500,
        temperature=0.7,
    )
    return response.choices[0].message.content.strip()

Prompt Engineering

The quality of AI commentary depends heavily on prompt design. GitaChat uses a structured approach.

Prompt Structure

Components (backend/utils.py:50-60):
  1. User Query Context:
prompt = f"""The user asked: "{query}"
  • Gives AI the exact user question
  • Ensures response stays relevant
  • Anchors the commentary to user’s needs
  2. Verse Information:
The most relevant verse from the Bhagavad Gita is Chapter {verse['chapter']}, Verse {verse['verse']}:
"{verse['translation']}"{commentary_context}
  • Provides verse text for interpretation
  • Includes chapter/verse for reference
  • Adds traditional commentary as context (up to 1500 chars)
  3. Response Instructions:
Write a 2-3 paragraph response that:
1. Explains how this verse directly addresses their situation or question
2. Draws practical wisdom they can apply to their life
3. Maintains a warm, thoughtful tone without being preachy

Vary your opening - don't start with "This verse...". Keep it concise but meaningful.
  • Specifies desired length and structure
  • Emphasizes practical application
  • Sets tone expectations
  • Prevents repetitive openings

Prompt Design Principles

Key Strategies:
1. Context Preservation:
  • User query quoted verbatim
  • Verse translation included in full
  • Traditional commentary provides scholarly grounding
2. Output Constraints:
  • 2-3 paragraphs keeps response digestible
  • 500 token limit prevents rambling
  • “Concise but meaningful” balances depth and brevity
3. Tone Calibration:
  • “Warm, thoughtful” creates approachable voice
  • “Without being preachy” avoids condescension
  • “Practical wisdom” focuses on applicability
4. Variation Techniques:
  • “Vary your opening” prevents formulaic responses
  • “Don’t start with ‘This verse…’” forces creativity
  • Results in more engaging, natural commentary

Model Parameters

OpenAI API Parameters (backend/utils.py:62-67):
response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=500,
    temperature=0.7,
)
Parameter Choices:
  • model: gpt-4o-mini
    • Balance of quality, speed, and cost
    • Good instruction following
    • Appropriate for creative writing
  • max_tokens: 500
    • ~300-400 words
    • Fits 2-3 paragraphs comfortably
    • Prevents excessive generation costs
  • temperature: 0.7
    • Balanced creativity and consistency
    • 0.0 = deterministic (too rigid)
    • 1.0 = maximum creativity (too unpredictable)
    • 0.7 = natural, varied, but coherent
  • messages: Single user message
    • No system prompt (instructions in user message)
    • Simpler for this use case
    • Reduces token usage

Integration with Search Pipeline

Contextual commentary generation is triggered after vector search completes.

API Endpoint Flow

Query Endpoint (backend/main.py:115-143):
@app.post("/api/query", response_model=dict)
@limiter.limit("30/minute")
async def query_gita(request: Request, query: Query) -> dict:
    """
    Query the Gita with the provided query string(s).
    Returns verse with contextual commentary tailored to the user's question.
    """
    try:
        from model import match
        from utils import generate_contextual_commentary

        result = match(query.query)
        if not result:
            raise HTTPException(status_code=404, detail="No matches found")

        # Generate contextual commentary that addresses the user's specific question
        try:
            contextual = generate_contextual_commentary(query.query, result)
            result["summarized_commentary"] = contextual
        except Exception as e:
            # Fall back to pre-computed summary if OpenAI fails
            logging.warning(f"Contextual commentary failed, using fallback: {e}")

        return {"status": "success", "data": result}
    except HTTPException:
        raise
    except Exception as e:
        logging.error(f"Query error: {type(e).__name__}: {e}")
        raise HTTPException(status_code=500, detail="Internal Server Error")
Processing Steps:
  1. Vector Search: match() finds relevant verse
  2. AI Generation: generate_contextual_commentary() creates custom response
  3. Override: Replaces pre-computed summary with contextual commentary
  4. Fallback: If AI fails, keeps original summary
  5. Return: Verse with both traditional and contextual commentary

Error Handling Strategy

Graceful Degradation (backend/main.py:131-136):
# Generate contextual commentary that addresses the user's specific question
try:
    contextual = generate_contextual_commentary(query.query, result)
    result["summarized_commentary"] = contextual
except Exception as e:
    # Fall back to pre-computed summary if OpenAI fails
    logging.warning(f"Contextual commentary failed, using fallback: {e}")
Why This Matters:
  • Reliability: App never fails due to OpenAI issues
  • User Experience: Always get a response, even if not customized
  • Debugging: Logs help diagnose OpenAI problems
  • Cost Control: Prevents retry loops on persistent failures
Common Failure Scenarios (distinguished in the sketch after this list):
  • OpenAI API timeout (>30s)
  • Rate limit exceeded
  • API key invalid or expired
  • Network connectivity issues
  • Model availability problems
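For illustration, a hedged sketch of logging these scenarios distinctly. generate_with_diagnostics is a hypothetical wrapper, not part of the codebase, and the exception classes assume the openai>=1.0 SDK; the production code simply catches a bare Exception, which covers all of them:
import logging

import openai

from utils import generate_contextual_commentary

def generate_with_diagnostics(query: str, verse: dict) -> str | None:
    """Log each failure scenario distinctly; return None to trigger the fallback."""
    try:
        return generate_contextual_commentary(query, verse)
    except openai.APITimeoutError:
        logging.warning("OpenAI request exceeded the 30s client timeout")
    except openai.RateLimitError:
        logging.warning("OpenAI rate limit exceeded; consider backing off")
    except openai.AuthenticationError:
        logging.error("GPT_KEY appears invalid or expired")
    except openai.APIConnectionError:
        logging.warning("Network connectivity problem reaching OpenAI")
    return None  # caller keeps the pre-computed summary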

Traditional Commentary Summarization

In addition to contextual commentary, GitaChat can summarize long traditional commentaries.

Summarize Function

Purpose: Condense lengthy traditional commentary for quick reading
Implementation (backend/utils.py:12-31):
def summarize(commentary_text: str) -> str:
    """Generate a summary of the commentary using GPT-4o-mini."""
    if not commentary_text or len(commentary_text) < 10:
        return ""

    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant that summarizes text concisely but completely.",
            },
            {
                "role": "user",
                "content": f"Summarize the following commentary: {commentary_text}",
            },
        ],
        max_tokens=500,
    )
    return response.choices[0].message.content.strip()
Use Cases (see the batch sketch below):
  • Pre-computing summaries for all verses
  • Batch processing traditional commentary
  • Offline summary generation
When Used:
  • During initial data upload to Pinecone
  • When traditional commentary is updated
  • For maintenance/migration scripts
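A hypothetical batch script (not part of the documented codebase) showing how summarize() could pre-compute summaries before a Pinecone upload:
from utils import summarize

def precompute_summaries(verses: list[dict]) -> list[dict]:
    """Attach a summarized_commentary to each verse dict, in place."""
    for verse in verses:
        full = verse.get("full_commentary", "")
        verse["summarized_commentary"] = summarize(full)  # returns "" for empty/short input
    return verses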

Summarization Strategy

System Prompt (backend/utils.py:20-22):
{
    "role": "system",
    "content": "You are a helpful assistant that summarizes text concisely but completely.",
}
Key Characteristics:
  • Concise: Removes verbose explanations
  • Complete: Preserves key insights
  • Faithful: Doesn’t add interpretation
  • Neutral: Maintains original tone
Difference from Contextual Commentary:
  • Summarization: Condenses existing text
  • Contextual: Creates new, query-specific response
  • Summarization: Used for pre-computation
  • Contextual: Generated on-demand per query

Frontend Display

The frontend shows both commentary types in an intuitive dual-panel layout.

Dual Commentary Component

Component Usage (frontend/app/page.tsx:175-178):
<DualCommentary
  contextual={resultData.summarized_commentary}
  traditional={resultData.full_commentary}
/>
User Experience:
  • Contextual Commentary: Shown first (most relevant to query)
  • Traditional Commentary: Available in expandable section
  • Toggle: Users can switch between or view both
  • Mobile: Stacked layout on small screens
Benefits:
  • Immediate relevance (contextual first)
  • Scholarly depth (traditional available)
  • User control (choose preferred style)
  • Space efficiency (expandable sections)

Commentary Types in Frontend

Type Definitions (frontend/app/lib/types.ts:14-21):
export interface VerseData {
  chapter: number;
  verse: number;
  translation: string;
  summarized_commentary: string;  // Contextual or pre-computed
  full_commentary?: string;        // Traditional (optional)
  related?: RelatedVerse[];
}
Field Usage:
  • summarized_commentary: AI-generated contextual response (query endpoint)
  • full_commentary: Traditional scholarly commentary (optional, expandable)
  • Backend replaces summarized_commentary with contextual version on search

Performance & Cost Analysis

Latency Impact

Commentary Generation Time:
  • GPT-4o-mini average: 1-3 seconds
  • Vector search: 60-150ms
  • Total query time: 1.1-3.2 seconds
Optimization Strategies:
  • Parallel processing (search + AI)
  • Timeout protection (30s max)
  • Caching for repeated queries
  • Pre-computed summaries as fallback

Cost Breakdown

Per-Query Costs (GPT-4o-mini):
  • Input tokens: ~300 tokens × $0.00015/1K = $0.000045
  • Output tokens: ~400 tokens × $0.0006/1K = $0.00024
  • Total per query: ~$0.0003
Monthly Costs (1,000 queries/day):
  • OpenAI: ~$9/month
  • Much cheaper than GPT-4 (~$45/month); see the arithmetic check below
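A back-of-envelope check of these figures:
input_cost = 300 / 1000 * 0.00015     # $0.000045 per query
output_cost = 400 / 1000 * 0.0006     # $0.00024 per query
per_query = input_cost + output_cost  # $0.000285, i.e. ~$0.0003
monthly = per_query * 1000 * 30       # $8.55, i.e. ~$9/month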

Quality Metrics

Observed Quality:
  • Relevance to query: 90%+
  • Tone appropriateness: 95%+
  • Factual accuracy: 85-90%
  • User satisfaction: High
Quality Factors:
  • Strong prompt engineering
  • Traditional commentary as context
  • Temperature 0.7 balance
  • 500 token limit prevents rambling

Failure Rate

Reliability Metrics:
  • OpenAI uptime: 99.9%
  • Generation success: ~98%
  • Fallback usage: ~2%
Common Failures:
  • Timeout (rare with 30s limit)
  • Rate limiting (handled by app)
  • Model overload (auto-retry)

Advanced Techniques

Context Window Management

Traditional Commentary Truncation (backend/utils.py:46-48):
commentary_context = verse.get("full_commentary") or verse.get("summarized_commentary") or ""
if commentary_context:
    commentary_context = f"\n\nTraditional commentary for context:\n{commentary_context[:1500]}"
Why Truncate at 1500 Characters?:
  • Balances context with token usage
  • ~300-400 tokens of traditional commentary
  • Leaves room for query, verse, and instructions
  • Prevents hitting max context length
  • Reduces input costs
Alternative Approaches:
  • Semantic truncation (keep most relevant sentences)
  • Summary-of-summary (two-stage compression)
  • Dynamic truncation based on query length (sketched below)
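A hedged sketch of the dynamic-truncation idea: give traditional commentary whatever character budget the query leaves over, instead of a fixed 1500. The budget values here are illustrative, not from the codebase:
def truncate_commentary(commentary: str, query: str, total_budget: int = 1800) -> str:
    """Shrink the commentary context as the query grows, keeping total input roughly constant."""
    remaining = max(500, total_budget - len(query))  # always keep at least 500 chars
    return commentary[:remaining]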

Prompt Variation

Preventing Repetitive Responses:
Vary your opening - don't start with "This verse...". Keep it concise but meaningful.
Impact:
  • Reduces formulaic responses
  • Increases engagement
  • Feels more natural and human
  • Users less likely to notice AI generation
Future Enhancements:
  • Rotate between multiple prompt templates
  • User preference for formal vs. casual tone
  • Question-type specific prompts (advice, explanation, inspiration); see the sketch below
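A hypothetical sketch of question-type specific prompts (a future enhancement, not current behavior; the categories and keywords are illustrative):
QUESTION_TYPE_INSTRUCTIONS = {
    "advice": "Offer gentle, practical guidance grounded in this verse.",
    "explanation": "Explain what this verse means and why it is relevant here.",
    "inspiration": "Write an uplifting reflection connecting this verse to the question.",
}

def choose_instruction(query: str) -> str:
    """Pick an instruction line to splice into the prompt based on rough query intent."""
    lowered = query.lower()
    if any(w in lowered for w in ("should i", "how do i", "how can i")):
        return QUESTION_TYPE_INSTRUCTIONS["advice"]
    if any(w in lowered for w in ("what does", "meaning", "explain")):
        return QUESTION_TYPE_INSTRUCTIONS["explanation"]
    return QUESTION_TYPE_INSTRUCTIONS["inspiration"]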

Caching Strategy (Future)

Potential Optimization: Cache common queries
Implementation Concept:
import hashlib

# Hash query + verse ID (hashlib, not hash(), so keys are stable across processes)
digest = hashlib.sha256(query.encode()).hexdigest()
cache_key = f"{digest}:{verse['chapter']}:{verse['verse']}"

# Check Redis cache (redis_client is an assumed redis.Redis instance)
if cached := redis_client.get(cache_key):
    return cached

# Generate and cache (TTL: 7 days)
commentary = generate_contextual_commentary(query, verse)
redis_client.setex(cache_key, 604800, commentary)
return commentary
Benefits:
  • Instant responses for repeat queries
  • 50-70% cost reduction
  • Lower latency
  • Reduced OpenAI API load

Multi-Language Support (Future)

Potential Feature: Generate commentary in the user’s language
Implementation Ideas:
# Detect user language (detect_language and translate are hypothetical helpers)
user_lang = detect_language(query)

# Generate in English first (best quality)
english_commentary = generate_contextual_commentary(query, verse)

# Translate if needed
if user_lang != "en":
    commentary = translate(english_commentary, target=user_lang)
else:
    commentary = english_commentary
Challenges:
  • Translation quality
  • Sanskrit term preservation
  • Cultural context nuances

Troubleshooting

Poor Commentary Quality

Symptoms: Generic, irrelevant, or off-topic responses
Debugging:
  1. Check whether traditional commentary is being passed
  2. Verify the query is preserved correctly
  3. Review the temperature setting
  4. Inspect the prompt structure
  5. Test with different queries
Solutions:
  • Adjust prompt instructions
  • Increase/decrease temperature
  • Add more examples in prompt
  • Use system prompt for tone

Slow Response Times

Symptoms: >5 second latency
Common Causes:
  • OpenAI model overload
  • Network latency
  • Large traditional commentary
  • Multiple retries
Solutions:
  • Check OpenAI status page
  • Reduce commentary truncation length
  • Implement caching
  • Add timeout warnings to user

API Errors

Common Error Codes:
  • 401: Invalid API key
  • 429: Rate limit exceeded
  • 500: OpenAI server error
  • 503: Model unavailable
Resolution:
# Check logs for the specific error
logging.warning(f"Contextual commentary failed, using fallback: {e}")

# Verify an API key is configured (avoid printing the key itself)
print(openai_client.api_key is not None)

# Check rate limits in the OpenAI dashboard

Fallback Always Triggered

Symptoms: Contextual commentary is never generated
Diagnostic Steps:
  1. Check that the GPT_KEY environment variable is set
  2. Verify the OpenAI client timeout (30s)
  3. Test the API key with a direct OpenAI call
  4. Review application logs
  5. Check network connectivity
Quick Test:
from clients import openai_client
response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Test"}],
    max_tokens=10
)
print(response.choices[0].message.content)

Best Practices

Prompt Engineering

  • Test prompts with diverse queries
  • Include negative constraints (“don’t…”)
  • Specify tone and length clearly
  • Provide traditional commentary as context
  • Iterate based on user feedback

Error Handling

  • Always implement fallback strategy
  • Log errors for debugging
  • Never expose API keys in errors
  • Timeout protection on all AI calls
  • Graceful degradation for users

Cost Management

  • Use GPT-4o-mini over GPT-4
  • Set appropriate max_tokens limits
  • Consider caching frequent queries
  • Monitor usage in OpenAI dashboard
  • Set spending alerts

Quality Assurance

  • Manually review sample outputs
  • Track user feedback
  • Monitor fallback usage rate
  • Test with edge cases
  • A/B test prompt variations
