AI Voice Translation

Overview

OpenShorts integrates the ElevenLabs Dubbing API to translate video audio into 30+ languages while preserving voice characteristics, timing, and emotional tone. The system uses AI voice cloning to match the original speaker’s voice in the target language.

Supported Languages

The translate.py module defines the 31 languages listed below, keyed by language code:

translate.py

SUPPORTED_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "pl": "Polish",
    "hi": "Hindi",
    "ja": "Japanese",
    "ko": "Korean",
    "zh": "Chinese",
    "ar": "Arabic",
    "ru": "Russian",
    "tr": "Turkish",
    "nl": "Dutch",
    "sv": "Swedish",
    "id": "Indonesian",
    "fil": "Filipino",
    "ms": "Malay",
    "vi": "Vietnamese",
    "th": "Thai",
    "uk": "Ukrainian",
    "el": "Greek",
    "cs": "Czech",
    "fi": "Finnish",
    "ro": "Romanian",
    "da": "Danish",
    "bg": "Bulgarian",
    "hr": "Croatian",
    "sk": "Slovak",
    "ta": "Tamil",
}

def get_supported_languages() -> dict:
    """Return dict of supported language codes and names."""
    return SUPPORTED_LANGUAGES.copy()
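
Because the language code is sent to ElevenLabs unchanged, it can help to check it against this dict before starting an upload. The following is a minimal sketch, not part of translate.py: `normalize_language_code` is a hypothetical helper, and the dict here is a small subset repeated only so the example runs on its own.

```python
from typing import Optional

# Subset of SUPPORTED_LANGUAGES from translate.py, repeated so the sketch is self-contained.
SUPPORTED_LANGUAGES = {"en": "English", "es": "Spanish", "fil": "Filipino", "zh": "Chinese"}


def normalize_language_code(code: Optional[str]) -> str:
    """Hypothetical helper: return a supported lowercase code or raise ValueError."""
    if not code or not code.strip():
        raise ValueError("Language code is required")
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(f"Unsupported language '{code}'. Supported: {supported}")
    return normalized
```

Calling this at the top of a request handler would let the API reject an unknown code with a clear message before any file is uploaded.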

Translation Workflow

1. Create Dubbing Project

Upload video to ElevenLabs and initiate dubbing:

translate.py

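# Module-level setup assumed by the excerpts on this page. The excerpts do not
# show it; the base URL below is the public ElevenLabs v1 API root.
import os
import time
from typing import Optional

import httpx

ELEVENLABS_API_BASE = "https://api.elevenlabs.io/v1"

def create_dubbing_project(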
    video_path: str,
    target_language: str,
    api_key: str,
    source_language: Optional[str] = None,
) -> dict:
    """
    Create a new dubbing project with ElevenLabs.
    
    Returns:
        dict with dubbing_id and expected_duration_sec
    """
    url = f"{ELEVENLABS_API_BASE}/dubbing"
    
    headers = {
        "xi-api-key": api_key,
    }
    
    # Prepare form data
    data = {
        "target_lang": target_language,
        "mode": "automatic",
        "num_speakers": "0",  # Auto-detect
        "watermark": "false",
    }
    
    if source_language:
        data["source_lang"] = source_language
    
    # Open and send the video file
    with open(video_path, "rb") as video_file:
        files = {
            "file": (os.path.basename(video_path), video_file, "video/mp4")
        }
        
        with httpx.Client(timeout=300.0) as client:
            response = client.post(url, headers=headers, data=data, files=files)
    
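    # Fail fast on HTTP errors (e.g. an invalid API key) before parsing the body
    response.raise_for_status()
    result = response.json()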
    return result  # {'dubbing_id': '...', 'expected_duration_sec': 60}
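
The upload above always labels the file as "video/mp4", whatever its extension. If clips can arrive in other containers, one option is to derive the content type from the filename. This is a sketch under that assumption; `guess_upload_content_type` and `build_file_field` are hypothetical helpers, and the fallback keeps the value the original code hardcodes.

```python
import mimetypes
import os


def guess_upload_content_type(video_path: str, default: str = "video/mp4") -> str:
    """Hypothetical helper: pick a Content-Type for the multipart upload."""
    content_type, _encoding = mimetypes.guess_type(video_path)
    # Fall back to the original hardcoded value when the type is unknown
    # or is not an audio/video type.
    if content_type is None or not content_type.startswith(("video/", "audio/")):
        return default
    return content_type


def build_file_field(video_path: str, file_obj) -> tuple:
    """Build the (filename, fileobj, content_type) tuple used in the `files` dict."""
    return (os.path.basename(video_path), file_obj, guess_upload_content_type(video_path))
```

Inside `create_dubbing_project`, the `files` dict would then become `{"file": build_file_field(video_path, video_file)}`.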

2. Poll for Completion

Check the status of a dubbing project. This function performs a single request; translate_video() calls it repeatedly until the project finishes:

translate.py

def get_dubbing_status(dubbing_id: str, api_key: str) -> dict:
    """
    Check the status of a dubbing project.
    
    Returns:
        dict with status ('dubbing', 'dubbed', 'failed') and other metadata
    """
    url = f"{ELEVENLABS_API_BASE}/dubbing/{dubbing_id}"
    
    headers = {
        "xi-api-key": api_key,
    }
    
    with httpx.Client(timeout=30.0) as client:
        response = client.get(url, headers=headers)
    
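    # Raise on HTTP errors so callers never mistake an error body for a status
    response.raise_for_status()
    return response.json()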

3. Download Dubbed Video

Retrieve the dubbed video:

translate.py

def download_dubbed_video(
    dubbing_id: str,
    target_language: str,
    output_path: str,
    api_key: str
) -> str:
    """
    Download the dubbed video file.
    
    Returns:
        Path to the downloaded file
    """
    url = f"{ELEVENLABS_API_BASE}/dubbing/{dubbing_id}/audio/{target_language}"
    
    headers = {
        "xi-api-key": api_key,
    }
    
    with httpx.Client(timeout=120.0) as client:
        with client.stream("GET", url, headers=headers) as response:
            # Avoid writing an error body (e.g. JSON) into the output file
            response.raise_for_status()
            with open(output_path, "wb") as f:
                for chunk in response.iter_bytes(chunk_size=8192):
                    f.write(chunk)
    
    return output_path
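
Since the download streams straight into `output_path`, an interrupted transfer can leave a truncated file behind. One way to avoid that, sketched here as an assumption rather than as the project's behavior, is to write to a temporary file in the same directory and rename it once the stream completes. `write_chunks_atomically` is a hypothetical helper.

```python
import os
import tempfile
from typing import Iterable


def write_chunks_atomically(chunks: Iterable[bytes], output_path: str) -> str:
    """Hypothetical helper: write chunks to a temp file, then rename into place."""
    directory = os.path.dirname(os.path.abspath(output_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            for chunk in chunks:
                tmp_file.write(chunk)
        # os.replace is atomic when source and destination share a filesystem.
        os.replace(tmp_path, output_path)
    except BaseException:
        # Remove the partial file so a failed download leaves nothing behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return output_path
```

Within the streaming block this could be called as `write_chunks_atomically(response.iter_bytes(chunk_size=8192), output_path)`.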

Main Translation Function

The translate_video() function orchestrates the entire workflow:

translate.py

def translate_video(
    video_path: str,
    output_path: str,
    target_language: str,
    api_key: str,
    source_language: Optional[str] = None,
    max_wait_seconds: int = 600,
    poll_interval: int = 5,
) -> str:
    """
    Translate a video to a target language using ElevenLabs dubbing.
    
    This is a blocking call that waits for the dubbing to complete.
    
    Args:
        video_path: Path to input video
        output_path: Path to save translated video
        target_language: Target language code (e.g., 'es', 'fr', 'de')
        api_key: ElevenLabs API key
        source_language: Source language code (auto-detected if None)
        max_wait_seconds: Maximum time to wait for dubbing (default 10 min)
        poll_interval: Seconds between status checks
    
    Returns:
        Path to the translated video
    """
    # Create dubbing project
    project = create_dubbing_project(
        video_path=video_path,
        target_language=target_language,
        api_key=api_key,
        source_language=source_language,
    )
    
    dubbing_id = project["dubbing_id"]
    expected_duration = project.get("expected_duration_sec", 60)
    
    print(f"[ElevenLabs] Dubbing ID: {dubbing_id}, Expected duration: {expected_duration}s")
    
    # Poll for completion
    start_time = time.time()
    while True:
        elapsed = time.time() - start_time
        if elapsed > max_wait_seconds:
            raise TimeoutError(f"Dubbing timed out after {max_wait_seconds} seconds")
        
        status = get_dubbing_status(dubbing_id, api_key)
        current_status = status.get("status", "unknown")
        
        print(f"[ElevenLabs] Status: {current_status} (elapsed: {int(elapsed)}s)")
        
        if current_status == "dubbed":
            # Download the result
            return download_dubbed_video(
                dubbing_id=dubbing_id,
                target_language=target_language,
                output_path=output_path,
                api_key=api_key,
            )
        
        elif current_status == "failed":
            error = status.get("error", "Unknown error")
            raise Exception(f"Dubbing failed: {error}") 
        
        # Still processing, wait and poll again
        time.sleep(poll_interval)
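
The loop above polls at a fixed interval. For long jobs, a growing interval reduces the number of status requests. The following is a sketch of that variant, not the project's implementation: `poll_until_done` and its parameters are hypothetical, and the status fetcher, sleep function, and clock are injected so the loop can be exercised without network access.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    max_wait_seconds: float = 600,
    initial_interval: float = 5,
    max_interval: float = 30,
    backoff: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Hypothetical variant of the polling loop with a growing interval."""
    start = clock()
    interval = initial_interval
    while True:
        status = fetch_status()
        state = status.get("status", "unknown")
        if state == "dubbed":
            return status
        if state == "failed":
            raise RuntimeError(f"Dubbing failed: {status.get('error', 'Unknown error')}")
        if clock() - start >= max_wait_seconds:
            raise TimeoutError(f"Dubbing timed out after {max_wait_seconds} seconds")
        # Still processing: wait, then lengthen the next wait up to max_interval.
        sleep(interval)
        interval = min(interval * backoff, max_interval)
```

In `translate_video()` this would be called with `fetch_status=lambda: get_dubbing_status(dubbing_id, api_key)`, followed by the download step once it returns.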

API Integration

The FastAPI endpoint handles translation requests:

app.py

class TranslateRequest(BaseModel):
    job_id: str
    clip_index: int
    target_language: str
    source_language: Optional[str] = None
    input_filename: Optional[str] = None

@app.post("/api/translate")
async def translate_clip(
    req: TranslateRequest,
    x_elevenlabs_key: Optional[str] = Header(None, alias="X-ElevenLabs-Key")
):
    """Translate a video clip to a different language using ElevenLabs dubbing."""
    if not x_elevenlabs_key:
        raise HTTPException(status_code=400, detail="Missing X-ElevenLabs-Key header")
    
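    # Resolve input video path. `output_dir` and `clip_data` are looked up from
    # req.job_id and req.clip_index earlier in the handler (omitted from this excerpt).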
    if req.input_filename:
        filename = os.path.basename(req.input_filename)
    else:
        filename = clip_data.get('video_url', '').split('/')[-1]
    
    input_path = os.path.join(output_dir, filename)
    
    # Output video with language suffix
    base, ext = os.path.splitext(filename)
    output_filename = f"translated_{req.target_language}_{base}{ext}"
    output_path = os.path.join(output_dir, output_filename)
    
    try:
        # Run translation in thread pool (blocking API calls)
        def run_translate():
            return translate_video(
                video_path=input_path,
                output_path=output_path,
                target_language=req.target_language,
                api_key=x_elevenlabs_key,
                source_language=req.source_language,
            )
        
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, run_translate)
        
    except Exception as e:
        print(f"❌ Translation Error: {e}")
        raise HTTPException(status_code=500, detail=str(e))
    
    # Update metadata with translated video URL
    return {
        "success": True,
        "new_video_url": f"/videos/{req.job_id}/{output_filename}"
    }
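
The endpoint builds the output filename directly from `req.target_language`, so a malformed value would end up in a filesystem path. Below is a sketch of a stricter builder that keeps the same `translated_{lang}_{name}` scheme. `build_translated_filename` is a hypothetical helper, and the 2 to 3 lowercase letter pattern is an assumption based on the codes in SUPPORTED_LANGUAGES.

```python
import os
import re

# Codes in SUPPORTED_LANGUAGES are short lowercase ASCII strings ("es", "fil").
_LANG_CODE = re.compile(r"[a-z]{2,3}")


def build_translated_filename(input_filename: str, target_language: str) -> str:
    """Hypothetical helper mirroring the endpoint's translated_{lang}_{name} scheme."""
    if not _LANG_CODE.fullmatch(target_language):
        raise ValueError(f"Invalid language code: {target_language!r}")
    # basename() drops directory components, so "../../x.mp4" cannot escape output_dir.
    filename = os.path.basename(input_filename)
    if not filename:
        raise ValueError("Input filename is empty")
    base, ext = os.path.splitext(filename)
    return f"translated_{target_language}_{base}{ext}"
```

A handler could catch the `ValueError` and return a 400 response instead of starting a dubbing job.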

Get Available Languages

Fetch the list of supported languages:

app.py

@app.get("/api/translate/languages")
async def get_languages():
    """Return supported languages for translation."""
    return {"languages": get_supported_languages()}

Example Response:

{
  "languages": {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ja": "Japanese",
    "ko": "Korean",
    "zh": "Chinese",
    "ar": "Arabic"
    // ... 21 more languages
  }
}

Voice Cloning Feature

ElevenLabs automatically clones the original speaker’s voice:

Automatic Voice Matching: With mode: "automatic", ElevenLabs runs the whole dubbing pipeline without manual input, including AI voice cloning that preserves:

Voice timbre and characteristics
Speaking pace and rhythm
Emotional tone and inflection
Lip sync timing (video only)

data = {
    "target_lang": target_language,
    "mode": "automatic",        # AI voice cloning
    "num_speakers": "0",       # Auto-detect number of speakers
    "watermark": "false",
}

Usage Example

Terminal

# Translate a clip to Spanish
curl -X POST http://localhost:8000/api/translate \
  -H "Content-Type: application/json" \
  -H "X-ElevenLabs-Key: YOUR_API_KEY" \
  -d '{
    "job_id": "a7f3c2d1-...",
    "clip_index": 0,
    "target_language": "es",
    "source_language": "en"
  }'

Response:

{
  "success": true,
  "new_video_url": "/videos/a7f3c2d1-.../translated_es_clip_1.mp4"
}
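
The same request can be made from Python. This sketch builds it with the standard library only; `build_translate_request` is a hypothetical helper, and the base URL and placeholder values are taken from the curl example above.

```python
import json
import urllib.request
from typing import Optional


def build_translate_request(
    base_url: str,
    api_key: str,
    job_id: str,
    clip_index: int,
    target_language: str,
    source_language: Optional[str] = None,
) -> urllib.request.Request:
    """Hypothetical helper: build the POST /api/translate request without sending it."""
    payload = {
        "job_id": job_id,
        "clip_index": clip_index,
        "target_language": target_language,
    }
    if source_language:
        payload["source_language"] = source_language
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-ElevenLabs-Key": api_key},
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen(request, timeout=...)` with a timeout longer than the server-side dubbing wait (600 seconds by default), since the endpoint blocks until dubbing finishes.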

Subtitle Generation for Dubbed Videos

Dubbed videos are automatically re-transcribed for accurate subtitles:

app.py

# Check if this is a dubbed video
is_dubbed = filename.startswith("translated_")

if is_dubbed:
    print("🎙️ Dubbed video detected, transcribing audio for subtitles...")
    def run_transcribe_srt():
        return generate_srt_from_video(input_path, srt_path)
    
    loop = asyncio.get_running_loop()
    success = await loop.run_in_executor(None, run_transcribe_srt)
else:
    # Use original transcript
    success = generate_srt(transcript, clip_data['start'], clip_data['end'], srt_path)

This ensures subtitles match the dubbed audio exactly, accounting for timing differences in the translated speech.
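
The dubbed-video check relies on the `translated_` prefix that the translate endpoint adds. The same naming scheme also carries the target language, which could be recovered as sketched below. `parse_dubbed_filename` is a hypothetical helper; whether `generate_srt_from_video` accepts a language hint is not shown in the source, so this only illustrates the parsing.

```python
from typing import Optional, Tuple

DUBBED_PREFIX = "translated_"


def parse_dubbed_filename(filename: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (language_code, original_name), or None if not dubbed."""
    if not filename.startswith(DUBBED_PREFIX):
        return None
    remainder = filename[len(DUBBED_PREFIX):]
    # The language code never contains "_", so split on the first underscore only.
    language, separator, original = remainder.partition("_")
    if not separator or not language or not original:
        return None
    return language, original
```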

Timeout & Polling Configuration

# Default settings
max_wait_seconds = 600      # 10 minutes timeout
poll_interval = 5           # Check status every 5 seconds

# For longer videos, increase timeout:
translate_video(
    video_path=input_path,
    output_path=output_path,
    target_language="es",
    api_key=api_key,
    max_wait_seconds=1200,  # 20 minutes
    poll_interval=10        # Check every 10 seconds
)

Error Handling

try:
    translated_path = translate_video(
        video_path=input_path,
        output_path=output_path,
        target_language=target_language,
        api_key=api_key
    )
    print(f"✅ Translation complete: {translated_path}")
except Exception as e:
    # translate_video() raises "Dubbing timed out after N seconds"
    if "timed out" in str(e).lower():
        print("⏱️ Dubbing timed out - try a shorter video or increase max_wait_seconds")
    elif "failed" in str(e).lower():
        print("❌ Dubbing failed - check ElevenLabs API status")
    else:
        print(f"❌ Translation error: {e}")
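
Matching on message text is fragile, because a reworded message silently changes which branch runs. An alternative, sketched here as a design option rather than as what translate.py does, is to raise dedicated exception types and branch on them. All class and function names below are hypothetical.

```python
class DubbingError(Exception):
    """Base class for dubbing errors (hypothetical, not defined in translate.py)."""


class DubbingTimeoutError(DubbingError):
    """Raised when dubbing does not finish within max_wait_seconds."""


class DubbingFailedError(DubbingError):
    """Raised when ElevenLabs reports the 'failed' status."""


def describe_error(exc: Exception) -> str:
    """Map an exception to the user-facing messages used in the example above."""
    if isinstance(exc, DubbingTimeoutError):
        return "Dubbing timed out - try a shorter video or increase max_wait_seconds"
    if isinstance(exc, DubbingFailedError):
        return "Dubbing failed - check ElevenLabs API status"
    return f"Translation error: {exc}"
```

With this approach, the polling loop would raise `DubbingTimeoutError` and `DubbingFailedError` in place of the generic exceptions, and callers could also catch `DubbingError` to handle both cases at once.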

Pricing Considerations

ElevenLabs bills dubbing against your plan's usage quota, so cost grows with the length of the source video:

Cost Estimate: A 60-second video with ~150 words costs approximately $0.30-0.50 to dub. Always check current ElevenLabs pricing.
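
As a rough worked example, the estimate above can be scaled to other clip lengths. This sketch assumes cost grows linearly with duration, which is an assumption made only for illustration; the default rates are the $0.30 to $0.50 per minute figures quoted above, not official ElevenLabs prices, and `estimate_dubbing_cost` is a hypothetical helper.

```python
from typing import Tuple


def estimate_dubbing_cost(
    duration_seconds: float,
    low_per_minute: float = 0.30,
    high_per_minute: float = 0.50,
) -> Tuple[float, float]:
    """Hypothetical helper: return a (low, high) USD estimate, assuming linear scaling."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = duration_seconds / 60
    return (round(minutes * low_per_minute, 2), round(minutes * high_per_minute, 2))
```

Under these assumptions, a 3-minute clip would land between $0.90 and $1.50.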

Related

Subtitles

Auto-generate subtitles for dubbed videos

Social Posting

Distribute dubbed clips to social media
