AI Translation

Doom provides AI-powered translation for documentation. It translates content automatically while preserving MDX formatting, code blocks, technical terms, and links, and it uses Azure OpenAI to produce context-aware translations.

Quick Start

Translate documentation from English to Chinese:
doom translate \
  --source en \
  --target zh \
  --glob "guide/**" "api/**"
This translates all files matching the glob patterns from docs/en/ to docs/zh/.

How It Works

The translation process:
  1. Scans source files matching glob patterns
  2. Checks SHA hashes to skip unchanged files
  3. Extracts frontmatter and content separately
  4. Preserves links, code blocks, and MDX components
  5. Translates using Azure OpenAI with custom prompts
  6. Restores formatting and special syntax
  7. Writes translated files with updated SHA
Large files are split into chunks before translation, and requests that hit rate limits are retried automatically.

Configuration

Environment Variables

Configure Azure OpenAI credentials:
# Required
export AZURE_OPENAI_API_KEY="your-api-key"

# Optional (with defaults)
export AZURE_OPENAI_ENDPOINT="https://azure-ai-api-gateway.alauda.cn"
export AZURE_OPENAI_MODEL="gpt-4.1-mini"
export OPENAI_API_VERSION="2025-04-01-preview"

Config File

Customize translation behavior in doom.config.yml:
translate:
  systemPrompt: |
    You are a technical documentation expert.
    Translate accurately while maintaining technical precision.
  userPrompt: |
    Additional instructions for this project:
    - Use formal tone
    - Preserve brand names

Command Usage

doom translate [root] [options]

Required Options

--glob, -g  (string[], required)
  Glob patterns for files to translate (relative to the source language directory)
doom translate --glob "**/*.md" "**/*.mdx"

Optional Parameters

root  (string)
  Root directory of the documentation (default: current directory)
--source, -s  (string, default: "en")
  Source language code. Supported: en, zh, ru, ja, ko, es, fr, de, pt, it
--target, -t  (string, default: "zh")
  Target language code. Supports the same languages as --source.
--copy, -C  (boolean, default: false)
  Copy relative asset files to the target directory instead of following links
--force, -f  (boolean, default: false)
  Force re-translation even if the source SHA matches (ignores the hash equality check)

Example Commands

# Translate guide and API docs
doom translate -s en -t zh -g "guide/**" "api/**"

Supported Languages

The translation system supports these language codes:

  • English: en
  • Chinese: zh
  • Russian: ru
  • Japanese: ja
  • Korean: ko
  • Spanish: es
  • French: fr
  • German: de
  • Portuguese: pt
  • Italian: it

Terminology Management

Define custom terminology translations for consistent technical terms across languages.

Supported Term Languages

Terminology management is available for: en, zh, ru

Terms File

Create a terms file (format depends on your implementation):
// Example terminology entry
{
  en: "deployment",
  zh: "部署",
  ru: "развертывание"
}
The translation system automatically:
  1. Detects relevant terms in source content
  2. Provides them to the AI as translation glossary
  3. Ensures consistent translation across documents
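
The detection and glossary steps above can be sketched as follows. This is a minimal illustration, not Doom's actual implementation: the names `Term`, `relevantTerms`, and `formatGlossary` are hypothetical, the second term entry is added only for the example, and the case-insensitive substring match is an assumption about how "relevant" terms might be found.

```typescript
type TermLang = "en" | "zh" | "ru";
type Term = Record<TermLang, string>;

// Illustrative entries; the first one is the example from this page.
const TERMS: Term[] = [
  { en: "deployment", zh: "部署", ru: "развертывание" },
  { en: "container", zh: "容器", ru: "контейнер" },
];

// Keep only the terms whose source-language form occurs in the content.
// Assumption: a case-insensitive substring test is good enough here;
// a real implementation may tokenize or match whole words.
function relevantTerms(content: string, terms: Term[], source: TermLang): Term[] {
  const haystack = content.toLowerCase();
  return terms.filter((term) => haystack.includes(term[source].toLowerCase()));
}

// Render the matched terms as glossary lines that could be placed in a prompt.
function formatGlossary(terms: Term[], source: TermLang, target: TermLang): string {
  return terms.map((term) => `- ${term[source]} => ${term[target]}`).join("\n");
}

const matched = relevantTerms("Create a Deployment first.", TERMS, "en");
console.log(formatGlossary(matched, "en", "zh")); // - deployment => 部署
```

Passing only the matched terms keeps the prompt short, which matters when a project defines a large glossary.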

Frontmatter Handling

Automatic Translation

Certain frontmatter fields are automatically translated:
---
title: Getting Started        # Translated
description: Quick intro      # Translated
author: John Doe             # Preserved
date: 2024-01-01             # Preserved
---
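
A rough sketch of how frontmatter fields might be split into translated and preserved groups. The key list is an assumption taken from the example above (`title` and `description`); the full set of fields Doom translates is not listed on this page, and `partitionFrontmatter` is a hypothetical helper.

```typescript
// Assumption: only these keys are translated, as in the example above.
const TRANSLATED_KEYS = ["title", "description"];

type Frontmatter = Record<string, unknown>;

interface PartitionedFrontmatter {
  toTranslate: Record<string, string>;
  preserved: Frontmatter;
}

// Split parsed frontmatter into string fields to send for translation
// and everything else, which is written back unchanged.
function partitionFrontmatter(frontmatter: Frontmatter): PartitionedFrontmatter {
  const toTranslate: Record<string, string> = {};
  const preserved: Frontmatter = {};
  for (const [key, value] of Object.entries(frontmatter)) {
    if (TRANSLATED_KEYS.includes(key) && typeof value === "string") {
      toTranslate[key] = value;
    } else {
      preserved[key] = value;
    }
  }
  return { toTranslate, preserved };
}

const parts = partitionFrontmatter({
  title: "Getting Started",
  description: "Quick intro",
  author: "John Doe",
  date: "2024-01-01",
});
console.log(Object.keys(parts.toTranslate)); // [ 'title', 'description' ]
console.log(Object.keys(parts.preserved));   // [ 'author', 'date' ]
```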

Custom Prompts

Add page-specific translation instructions:
---
title: Architecture Overview
i18n:
  additionalPrompts: |
    - Use formal technical language
    - Translate "container" as "容器"
    - Keep all CLI commands in English
  disableAutoTranslation: false
---
i18n.additionalPrompts  (string)
  Custom instructions for translating this specific page
i18n.disableAutoTranslation  (boolean, default: false)
  Skip automatic translation for this page

Source SHA

The system tracks content changes using SHA hashes:
---
title: Getting Started
sourceSHA: a3f8e9d2c1b7f6e5d4c3b2a1f0e9d8c7
---
This prevents unnecessary re-translation when the source hasn't changed.
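
The skip check can be sketched as below. This is an illustration under stated assumptions: the hash algorithm Doom uses is not given on this page, so SHA-256 from Node's built-in `node:crypto` module stands in for it, and `hashContent` and `needsTranslation` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Assumption: SHA-256 is used purely for illustration; the actual
// algorithm behind sourceSHA is not documented here.
function hashContent(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Returns true when the target file must be (re)translated.
function needsTranslation(
  sourceContent: string,
  storedSourceSHA: string | undefined,
  force = false,
): boolean {
  if (force) return true;                         // mirrors --force
  if (storedSourceSHA === undefined) return true; // never translated before
  return hashContent(sourceContent) !== storedSourceSHA;
}

const source = "# Getting Started\n";
const sha = hashContent(source);
console.log(needsTranslation(source, sha));                // false
console.log(needsTranslation(source + "New line\n", sha)); // true
console.log(needsTranslation(source, sha, true));          // true
```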

Content Preservation

The translation system preserves the following elements during translation.

Links

All link formats remain unchanged:
[Link text](URL)              # URL preserved, text translated
[Reference][ref]              # Both preserved
[ref]: URL                    # Both preserved
![Alt text](image.png)        # Path preserved, alt translated
[Anchor](#section)            # Anchor preserved
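
One common way to keep URLs intact is to replace them with placeholders before translation and restore them afterwards. The sketch below shows that technique for inline links and images only; it is an assumption about how such preservation could work, not a description of Doom's internals. `maskUrls`, `restoreUrls`, and the `__URL_n__` placeholder format are hypothetical.

```typescript
interface Masked {
  text: string;
  urls: string[];
}

// Matches inline links and images: [text](url) or ![alt](url).
// Simplified: links with titles, nested brackets, and URLs containing
// parentheses are not handled.
const INLINE_LINK = /(!?\[[^\]]*\]\()([^)\s]+)(\))/g;

// Replace each URL with a numbered placeholder and remember the original.
function maskUrls(markdown: string): Masked {
  const urls: string[] = [];
  const text = markdown.replace(
    INLINE_LINK,
    (_match: string, open: string, url: string, close: string) => {
      urls.push(url);
      return `${open}__URL_${urls.length - 1}__${close}`;
    },
  );
  return { text, urls };
}

// Put the original URLs back; unknown placeholders are left untouched.
function restoreUrls(text: string, urls: string[]): string {
  return text.replace(/__URL_(\d+)__/g, (match: string, index: string) => urls[Number(index)] ?? match);
}

const masked = maskUrls("See [the guide](../guide/intro.md) and ![Arch](image.png).");
console.log(masked.text); // See [the guide](__URL_0__) and ![Arch](__URL_1__).

// Pretend the masked text came back translated, then restore the URLs.
const translated = "参见 [指南](__URL_0__) 和 ![架构](__URL_1__)。";
console.log(restoreUrls(translated, masked.urls)); // 参见 [指南](../guide/intro.md) 和 ![架构](image.png)。
```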

Code Blocks

Code and syntax are never translated:
Text is translated.

````bash
# This command is preserved exactly
doom build
````

More translated text.

MDX Components

<Note>
  This content is translated
</Note>

<CodeGroup>
````bash
Preserved code
````
</CodeGroup>
Component names and props remain unchanged; only the content inside components is translated.

Special Comments

Preserved comments:
{/* release-notes-for-bugs */}
<!-- release-notes-for-bugs -->
Removed comments:
{/* reference-start */}
{/* reference-end */}
<!-- reference-start -->
<!-- reference-end -->

Heading Anchors

Custom heading IDs are preserved:
## Getting Started {#getting-started}
The anchor {#getting-started} remains unchanged in translation.
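
A small sketch of how a heading could be split so that only its text is translated while the `{#id}` anchor is carried over verbatim. The regular expression and the helper names `parseHeading` and `rebuildHeading` are illustrative assumptions.

```typescript
interface ParsedHeading {
  level: number;
  text: string;
  anchor?: string;
}

// Heading marker, heading text, and an optional trailing {#custom-id}.
const HEADING = /^(#{1,6})[ \t]+(.*?)[ \t]*(\{#[^}\s]+\})?[ \t]*$/;

function parseHeading(line: string): ParsedHeading | undefined {
  const match = HEADING.exec(line);
  if (!match) return undefined;
  const [, hashes, text, anchor] = match;
  return { level: hashes.length, text, anchor };
}

// Reassemble the heading with translated text and the original anchor.
function rebuildHeading(heading: ParsedHeading, translatedText: string): string {
  const suffix = heading.anchor ? ` ${heading.anchor}` : "";
  return `${"#".repeat(heading.level)} ${translatedText}${suffix}`;
}

const parsed = parseHeading("## Getting Started {#getting-started}");
if (parsed) {
  console.log(rebuildHeading(parsed, "快速开始")); // ## 快速开始 {#getting-started}
}
```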

Technical Terms

Preserved by default:
  • Product names: Kubernetes, Docker, React, etc.
  • Technical terms: API, REST, JSON, YAML, CLI, etc.
  • Code identifiers: Variable names, function names
  • URLs and email addresses

Chunking for Large Files

Files larger than 60KB are automatically split into chunks:
// Automatic chunking process:
1. Split content by lines
2. Create chunks under 60KB each
3. Translate each chunk separately
4. Combine translated chunks
5. Maintain context between chunks
Chunking happens transparently. The system informs the AI that content is part of a larger document to maintain consistency.
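
The line-based splitting described above can be sketched like this. It is a minimal illustration, assuming that "60KB" means 60 * 1024 bytes of UTF-8 and that a chunk never breaks inside a line; `chunkByLines` is a hypothetical name, and the real chunker may differ.

```typescript
// Assumption: 60KB is interpreted as 60 * 1024 bytes of UTF-8 text.
const MAX_CHUNK_BYTES = 60 * 1024;

const encoder = new TextEncoder();

// Split content on line boundaries so that each chunk stays under maxBytes.
// A single line longer than maxBytes becomes its own oversized chunk,
// because this sketch never splits inside a line.
function chunkByLines(content: string, maxBytes: number = MAX_CHUNK_BYTES): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let currentBytes = 0;

  for (const line of content.split("\n")) {
    const lineBytes = encoder.encode(line).length + 1; // +1 for the newline
    if (current.length > 0 && currentBytes + lineBytes > maxBytes) {
      chunks.push(current.join("\n"));
      current = [];
      currentBytes = 0;
    }
    current.push(line);
    currentBytes += lineBytes;
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}

// Ten 4-byte lines with a 12-byte limit: two lines fit per chunk.
const sample = Array.from({ length: 10 }, () => "aaaa").join("\n");
const chunks = chunkByLines(sample, 12);
console.log(chunks.length);                 // 5
console.log(chunks.join("\n") === sample);  // true
```

Measuring bytes rather than characters matters for languages such as Chinese, where one character takes several bytes in UTF-8.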

Rate Limiting

Built-in rate limiting prevents API throttling:
{
  interval: 60_000,  // 1 minute
  rate: 50,          // 50 requests
  concurrency: 10    // 10 concurrent requests
}
Automatic retry with exponential backoff on rate limit errors:
  • Initial retry: 60 seconds
  • Max retries: 15
  • Increasing delay on subsequent retries
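
The retry behavior above could look roughly like the following. The 60-second initial delay and the limit of 15 retries come from this page; the doubling factor, the 10-minute cap, and the names `retryDelayMs` and `withRetry` are assumptions made for the sketch, since the exact backoff schedule is not documented here.

```typescript
const INITIAL_DELAY_MS = 60_000;   // from the text: first retry after 60 seconds
const MAX_RETRIES = 15;            // from the text
const BACKOFF_FACTOR = 2;          // assumption: growth factor is not documented
const MAX_DELAY_MS = 10 * 60_000;  // assumption: cap so delays stay bounded

// Delay before retry number `attempt` (1-based).
function retryDelayMs(attempt: number): number {
  return Math.min(INITIAL_DELAY_MS * BACKOFF_FACTOR ** (attempt - 1), MAX_DELAY_MS);
}

// Run `task`, retrying only on rate limit errors, up to MAX_RETRIES times.
async function withRetry<T>(
  task: () => Promise<T>,
  isRateLimit: (error: unknown) => boolean,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      // Give up on non-rate-limit errors or once all retries are used.
      if (!isRateLimit(error) || attempt > MAX_RETRIES) throw error;
      await sleep(retryDelayMs(attempt));
    }
  }
}

console.log(retryDelayMs(1)); // 60000
console.log(retryDelayMs(2)); // 120000
console.log(retryDelayMs(5)); // 600000
```

The `sleep` parameter is injectable so the loop can be exercised in tests without waiting for real delays.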

Special Modes

Full Sync Mode

Using the glob pattern * enables full sync mode:
doom translate -s en -t zh -g "*"
This mode:
  • Translates all source files
  • Removes target files without matching source
  • Ensures target directory exactly matches source structure
  • Shows warning about file removal
Full sync mode will delete unmatched target files. Use with caution!
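
Before running a full sync, it can help to see which target files would count as unmatched. The sketch below compares two lists of paths, each relative to its language directory; `findOrphanedTargets` is a hypothetical helper for illustration and is not part of the doom CLI.

```typescript
// Given relative paths under the source and target language directories,
// list the target files that have no counterpart in the source.
function findOrphanedTargets(sourceFiles: string[], targetFiles: string[]): string[] {
  const sources = new Set(sourceFiles);
  return targetFiles.filter((file) => !sources.has(file)).sort();
}

const orphans = findOrphanedTargets(
  ["guide/intro.md", "api/index.md"],
  ["guide/intro.md", "guide/old.md", "api/index.md", "legacy/page.md"],
);
console.log(orphans); // [ 'guide/old.md', 'legacy/page.md' ]
```

Committing the target directory before a full sync gives a straightforward way to recover any file that was removed unintentionally.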

Copy-Only Directories

Certain directories are copied instead of translated:
const COPY_ONLY_DIRECTORIES = [
  'apis/advanced_apis/**',
  'apis/kubernetes_apis/**'
]
Files in these directories:
  • Update sourceSHA in frontmatter
  • Copy content exactly without translation
  • Maintain structure and formatting

Advanced Usage

Custom System Prompt

The default prompt is highly optimized, but you can customize it:
translate:
  systemPrompt: |
    You are a professional technical writer specializing in <%= targetLang %>.
    
    Requirements:
    - Maintain technical accuracy
    - Use <%= targetLang %> conventions
    - Preserve all MDX formatting
    - Keep code blocks unchanged
    
    <% if (terms) { %>
    Terminology:
    <%- terms %>
    <% } %>
Available template variables:
  • sourceLang - Source language name
  • targetLang - Target language name
  • terms - Relevant terminology list
  • titleTranslationPrompt - Pre-defined title translations
  • userPrompt - Custom user instructions
  • additionalPrompts - Page-specific prompts
  • isChunk - Whether this is part of a larger document

Title Translation Map

Pre-define translations for common titles:
// From source code
const TITLE_TRANSLATION_MAP = [
  {
    en: "Getting Started",
    zh: "快速开始",
    ru: "Начало работы"
  },
  {
    en: "Installation",
    zh: "安装",
    ru: "Установка"
  }
]
These are automatically used when detected in first-level headings.
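
A lookup against such a map might work as sketched here. The map entries are copied from the example above; the heading regular expression, the exact-match comparison, and the names `firstLevelHeading` and `predefinedTitle` are assumptions for illustration.

```typescript
type TitleLang = "en" | "zh" | "ru";

const TITLE_TRANSLATION_MAP: Record<TitleLang, string>[] = [
  { en: "Getting Started", zh: "快速开始", ru: "Начало работы" },
  { en: "Installation", zh: "安装", ru: "Установка" },
];

// Return the text of the first "# " heading, if any.
// Simplified: headings inside code fences are not excluded.
function firstLevelHeading(markdown: string): string | undefined {
  const match = /^#[ \t]+(.+?)[ \t]*$/m.exec(markdown);
  return match?.[1];
}

// Look up a pre-defined translation for the first-level heading.
function predefinedTitle(
  markdown: string,
  source: TitleLang,
  target: TitleLang,
): string | undefined {
  const heading = firstLevelHeading(markdown);
  if (heading === undefined) return undefined;
  const entry = TITLE_TRANSLATION_MAP.find((item) => item[source] === heading);
  return entry?.[target];
}

console.log(predefinedTitle("# Getting Started\n\nWelcome.", "en", "zh")); // 快速开始
console.log(predefinedTitle("# Custom Title", "en", "zh"));                // undefined
```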

Internal Routes

Exclude internal/draft pages from translation:
internalRoutes:
  - '*/internal/**'
  - '*/drafts/**'
  - '*/wip/**'
Files matching these patterns are automatically skipped.

Asset Handling

By default, asset paths are preserved:
<!-- Source: docs/en/guide.md -->
![Architecture](../images/arch.png)

<!-- Target: docs/zh/guide.md -->
![架构](../images/arch.png)
Both language versions reference the same image.

Copy Mode

With --copy, assets are duplicated:
doom translate -s en -t zh -g "**" --copy
docs/
├── en/
│   ├── images/arch.png
│   └── guide.md
└── zh/
    ├── images/arch.png    # Copied
    └── guide.md
This creates language-specific asset copies.

Translation Quality

The system ensures quality through:
  1. Context Awareness: The AI receives full document context and terminology.
  2. Format Preservation: MDX structure, code blocks, and components remain intact.
  3. Terminology Consistency: Technical terms are translated consistently across documents.
  4. Review Capability: SHA tracking allows reviewing only changed content.

Troubleshooting

API Authentication Error

Verify your Azure OpenAI credentials:
# Check environment variables
echo $AZURE_OPENAI_API_KEY
echo $AZURE_OPENAI_ENDPOINT

# Test connection
curl -H "api-key: $AZURE_OPENAI_API_KEY" \
     "$AZURE_OPENAI_ENDPOINT/openai/deployments?api-version=2023-05-15"

Rate Limit Errors

The system automatically retries, but if errors persist:
  1. Reduce concurrency (modify source if needed)
  2. Translate in smaller batches:
    doom translate -g "guide/part1/**"
    doom translate -g "guide/part2/**"
    
  3. Wait for rate limit reset
  4. Check your Azure OpenAI quota

Missing Translations

If files aren’t being translated:
  1. Verify glob patterns match files:
    ls docs/en/guide/**/*.md
    
  2. Check for disableAutoTranslation: true in frontmatter
  3. Ensure files aren’t in internalRoutes
  4. Look for errors in console output

Formatting Issues

If translated files have formatting problems:
  1. Check source file for valid MDX syntax
  2. Verify code blocks are properly closed
  3. Ensure MDX components are valid
  4. Use --force to re-translate after fixing source
  5. Review the systemPrompt for formatting instructions

Best Practices

Start Small

Test translation on a few files first to verify quality and settings.

Use Terminology

Define key terms upfront for consistency across all translations.

Review Translations

Always review AI translations, especially for technical content.

Leverage SHA Tracking

Use SHA tracking to review only the translations of content that changed.

Batch by Section

Translate related content together for better context.

Version Control

Commit translations separately from source changes for easier review.

Integration Examples

CI/CD Pipeline

# .github/workflows/translate.yml
name: Auto-translate Documentation

on:
  push:
    branches: [main]
    paths:
      - 'docs/en/**'

jobs:
  translate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '20'
      
      - name: Install dependencies
        run: yarn install
      
      - name: Translate to Chinese
        env:
          AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
        run: |
          yarn doom translate -s en -t zh -g "**"
      
      - name: Translate to Russian
        env:
          AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
        run: |
          yarn doom translate -s en -t ru -g "**"
      
      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v5
        with:
          commit-message: "docs: auto-translate documentation"
          title: "Auto-translated documentation"
          body: "Automated translation from English source"
          branch: auto-translate

NPM Scripts

{
  "scripts": {
    "translate:zh": "doom translate -s en -t zh -g '**'",
    "translate:ru": "doom translate -s en -t ru -g '**'",
    "translate:all": "npm run translate:zh && npm run translate:ru",
    "translate:force": "doom translate -s en -t zh -g '**' --force"
  }
}
