Tinbox provides three translation algorithms, each optimized for different document types and translation quality requirements. Choose the right algorithm based on your document structure and desired output quality.
Algorithm Overview
Page
Translates documents page by page independently. Fast and cost-effective.
Sliding Window
Uses overlapping windows for consistent terminology. Best for continuous text.
Context-Aware
Maintains context across chunks with smart splitting. Highest quality output.
Page-by-Page Algorithm
The page-by-page algorithm translates each page independently without maintaining context between pages. This is the fastest and most cost-effective approach.
How It Works
- Each page is translated as a separate, independent request
- No context from previous pages is shared
- Failed pages are tracked and marked in the output
- Supports checkpoint/resume for long documents
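The steps above can be illustrated with a minimal sketch. It is not the Tinbox implementation: the function name, the `translate` callable, and the blank-line page separator are assumptions made for the example, and the exact placeholder format Tinbox writes may differ. Checkpointing is left out to keep the sketch short.

```python
from typing import Callable, List, Tuple

FAILED_MARKER = "[TRANSLATION_FAILED]"  # placeholder text as named on this page


def translate_page_by_page(
    pages: List[str],
    translate: Callable[[str], str],  # hypothetical: translates one page of text
) -> Tuple[str, List[int]]:
    """Translate each page independently and record which pages failed."""
    translated: List[str] = []
    failed_pages: List[int] = []
    for page_number, page in enumerate(pages, start=1):
        try:
            # One independent request per page; no earlier page is passed along.
            translated.append(translate(page))
        except Exception:
            # Keep the page slot so the document structure survives the failure.
            translated.append(FAILED_MARKER)
            failed_pages.append(page_number)
    # Assumption: pages are joined with a blank line in the output.
    return "\n\n".join(translated), failed_pages


def fake_translate(page: str) -> str:
    """Stand-in translator: upper-cases text and fails on pages containing 'bad'."""
    if "bad" in page:
        raise RuntimeError("simulated API failure")
    return page.upper()


text, failed = translate_page_by_page(["one", "bad page", "three"], fake_translate)
print(text)
print(failed)
```

Because a failed page still occupies its slot in the output, a reader can see where the gap is and retry only those page numbers.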
Best For
- PDF documents with distinct pages
- Documents where pages are self-contained
- When speed and cost are priorities
- Large documents where context isn’t critical
Code Example
Source: algorithms.py:147-354
CLI Usage
Failed pages are marked with [TRANSLATION_FAILED] placeholders in the output, making failures visible while preserving the document structure.
Sliding Window Algorithm
The sliding window algorithm combines all pages into a single text, then creates overlapping windows for translation. This ensures consistent terminology across window boundaries.
How It Works
- All pages are joined into a single continuous text
- Text is split into overlapping windows of configurable size
- Each window is translated with a specified overlap
- Translated windows are merged by detecting and removing duplicate overlap regions
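The windowing and merging steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in algorithms.py: `create_windows` and `merge_chunks` are hypothetical names, and the merge only removes an overlap when the end of one chunk exactly matches the start of the next. Translated overlaps rarely match character for character, so a production merge would likely need to be more tolerant. The demo merges untranslated windows to show the mechanics.

```python
from typing import List


def create_windows(text: str, window_size: int = 2000, overlap_size: int = 200) -> List[str]:
    """Split text into windows that each share overlap_size characters with the previous one."""
    if overlap_size >= window_size:
        raise ValueError("overlap_size must be smaller than window_size")
    step = window_size - overlap_size
    windows: List[str] = []
    start = 0
    while start < len(text):
        windows.append(text[start:start + window_size])
        if start + window_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return windows


def merge_chunks(chunks: List[str], overlap_size: int) -> str:
    """Join chunks, dropping the longest exact suffix/prefix match at each seam."""
    if not chunks:
        return ""
    result = chunks[0]
    for chunk in chunks[1:]:
        max_len = min(overlap_size, len(result), len(chunk))
        # Try the longest candidate overlap first, then shrink.
        for size in range(max_len, 0, -1):
            if result.endswith(chunk[:size]):
                chunk = chunk[size:]
                break
        result += chunk  # falls back to plain concatenation if nothing matched
    return result


text = "abcdefghijklmnopqrstuvwxyz"
windows = create_windows(text, window_size=10, overlap_size=3)
print(windows)
print(merge_chunks(windows, overlap_size=3) == text)
```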
Configuration Options
- --window-size: Size of each window in characters (default: 2000)
- --overlap-size: Overlap between windows in characters (default: 200)
Best For
- Continuous text documents (novels, articles, essays)
- DOCX and TXT files without page breaks
- When consistent terminology is important
- Documents with flowing narrative
Code Example
Source: algorithms.py:520-611
CLI Usage
Context-Aware Algorithm
The context-aware algorithm provides the highest quality translations by maintaining context from previous chunks and using smart text splitting at natural boundaries.
How It Works
- Text is split at natural boundaries (paragraphs, sentences, clauses)
- Each chunk is translated with context from the previous chunk
- Context includes both the original text and its translation
- The next chunk preview is provided for better flow
- Translated chunks are directly concatenated (no merging needed)
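A minimal sketch of the loop described above: each chunk is translated together with the previous chunk, its translation, and a preview of the next chunk, and the results are concatenated directly. The function names, the `translate(chunk, context)` signature, and the bracketed tag names are all assumptions for illustration; the format Tinbox actually sends to the model may differ.

```python
from typing import Callable, List, Optional


def build_context(
    previous_chunk: Optional[str],
    previous_translation: Optional[str],
    next_chunk: Optional[str],
) -> str:
    """Assemble the context passed alongside the chunk being translated."""
    parts: List[str] = []
    if previous_chunk and previous_translation:
        # Hypothetical tag names; both the original and its translation are included.
        parts.append(f"[PREVIOUS_CHUNK]\n{previous_chunk}\n[/PREVIOUS_CHUNK]")
        parts.append(
            f"[PREVIOUS_TRANSLATION]\n{previous_translation}\n[/PREVIOUS_TRANSLATION]"
        )
    if next_chunk:
        parts.append(f"[NEXT_CHUNK]\n{next_chunk}\n[/NEXT_CHUNK]")
    return "\n\n".join(parts)


def translate_with_context(
    chunks: List[str],
    translate: Callable[[str, str], str],  # hypothetical: (chunk, context) -> translation
) -> str:
    """Translate chunks in order, carrying the previous chunk and its translation forward."""
    translations: List[str] = []
    prev_chunk: Optional[str] = None
    prev_translation: Optional[str] = None
    for i, chunk in enumerate(chunks):
        next_chunk = chunks[i + 1] if i + 1 < len(chunks) else None
        context = build_context(prev_chunk, prev_translation, next_chunk)
        translation = translate(chunk, context)
        translations.append(translation)
        prev_chunk, prev_translation = chunk, translation
    # Chunks do not overlap, so the results are joined without any merge step.
    return "".join(translations)


seen_contexts: List[str] = []


def fake_translate(chunk: str, context: str) -> str:
    """Stand-in translator that records the context it was given."""
    seen_contexts.append(context)
    return chunk.upper()


result = translate_with_context(["First. ", "Second. ", "Third."], fake_translate)
print(result)
```

The first chunk has no previous context and the last chunk has no preview, which is why both are optional in `build_context`.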
Smart Text Splitting
The algorithm splits text at natural boundaries in priority order:
- Custom split token (if provided) - ignores target size
- Paragraph breaks (\n\n)
- Sentence endings (.!? followed by space)
- Line breaks (\n)
- Clause boundaries (;:, followed by space)
- Word boundaries (whitespace)
- Hard split at target size (fallback)
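The priority order above can be sketched as a search for the strongest boundary that fits inside the target size. This is a simplified illustration, not the code in algorithms.py: the names `find_split_point` and `smart_split` are hypothetical, the sketch always cuts at the last matching boundary inside the window, and it drops the custom token from the output, which may not match what Tinbox does.

```python
import re
from typing import List, Optional

# Boundary patterns in priority order; a chunk ends where the chosen match ends.
BOUNDARY_PATTERNS = [
    re.compile(r"\n\n"),     # paragraph break
    re.compile(r"[.!?] "),   # sentence ending followed by a space
    re.compile(r"\n"),       # line break
    re.compile(r"[;:,] "),   # clause boundary followed by a space
    re.compile(r"\s"),       # any whitespace (word boundary)
]


def find_split_point(text: str, target_size: int) -> int:
    """Return the cut index, preferring the highest-priority boundary available."""
    head = text[:target_size]
    for pattern in BOUNDARY_PATTERNS:
        matches = list(pattern.finditer(head))
        if matches:
            return matches[-1].end()  # cut just after the last such boundary
    return target_size  # hard split: no boundary inside the window


def smart_split(
    text: str,
    target_size: int = 2000,
    custom_split_token: Optional[str] = None,
) -> List[str]:
    """Split text into chunks of at most target_size characters at natural boundaries."""
    if custom_split_token:
        # A custom token overrides everything else, including the target size.
        return [part for part in text.split(custom_split_token) if part]
    chunks: List[str] = []
    remaining = text
    while len(remaining) > target_size:
        cut = find_split_point(remaining, target_size)
        chunks.append(remaining[:cut])
        remaining = remaining[cut:]
    if remaining:
        chunks.append(remaining)
    return chunks


sample = "Para one. Still one.\n\nPara two, with a clause; and more words here."
chunks = smart_split(sample, target_size=30)
print(chunks)
```

Each chunk keeps its trailing separator, so joining the chunks reproduces the input exactly. That property is what allows translated chunks to be concatenated without a merge step.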
Source: algorithms.py:614-717
Context Information
Source: algorithms.py:720-759
Configuration Options
- --context-size: Target chunk size in characters (default: 2000)
- --custom-split-token: Custom token to split on (ignores context-size)
Best For
- High-quality literary translations
- Technical documentation requiring consistent terminology
- Documents with complex narrative structure
- When translation quality is the top priority
CLI Usage
Algorithm Comparison
| Feature | Page-by-Page | Sliding Window | Context-Aware |
|---|---|---|---|
| Context | None | Overlap only | Full context |
| Speed | Fastest | Medium | Slowest |
| Cost | Lowest | Medium | Highest |
| Quality | Good | Better | Best |
| Text Splitting | By page | Fixed windows | Smart boundaries |
| PDF Support | ✅ Yes | ❌ No | ❌ No |
| Image Support | ✅ Yes | ❌ No | ❌ No |
| Best For | PDFs, speed | Continuous text | Quality, technical docs |
Checkpoint Support
All three algorithms support checkpointing for resuming interrupted translations. Checkpoints save translation state every N pages/chunks. If translation is interrupted, Tinbox automatically resumes from the last checkpoint.
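To make the save-every-N and resume behavior concrete, here is a minimal sketch. The JSON file layout, the function name, and the `frequency` parameter are assumptions for illustration; Tinbox's actual checkpoint format and options are not shown on this page.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def translate_with_checkpoints(
    chunks: List[str],
    translate: Callable[[str], str],  # hypothetical: translates one page or chunk
    checkpoint_path: Path,
    frequency: int = 2,               # hypothetical: save after every N completed items
) -> List[str]:
    """Translate chunks, saving progress periodically and skipping finished work on resume."""
    done: Dict[str, str] = {}
    if checkpoint_path.exists():
        # Resume: load translations completed before the interruption.
        done = json.loads(checkpoint_path.read_text(encoding="utf-8"))
    for index, chunk in enumerate(chunks):
        key = str(index)  # JSON object keys are strings
        if key in done:
            continue
        done[key] = translate(chunk)
        if len(done) % frequency == 0:
            checkpoint_path.write_text(json.dumps(done), encoding="utf-8")
    checkpoint_path.write_text(json.dumps(done), encoding="utf-8")
    return [done[str(i)] for i in range(len(chunks))]


calls: List[str] = []


def flaky_translate(chunk: str) -> str:
    """Stand-in translator that fails the first time it sees chunk 'd'."""
    calls.append(chunk)
    if chunk == "d" and calls.count("d") == 1:
        raise RuntimeError("simulated interruption")
    return chunk.upper()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "checkpoint.json"
    chunks = ["a", "b", "c", "d", "e"]
    try:
        translate_with_checkpoints(chunks, flaky_translate, path)
    except RuntimeError:
        pass  # first run is interrupted after the checkpoint for "a" and "b"
    result = translate_with_checkpoints(chunks, flaky_translate, path)

print(result)
print(calls)
```

In the demo, the second run skips "a" and "b" because they were saved, but retranslates "c": it had finished before the interruption, yet no checkpoint had been written since. A smaller N loses less work at the cost of more frequent writes.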
Choosing the Right Algorithm
When to use Page-by-Page
- You’re translating PDF documents
- Speed and cost are your primary concerns
- Pages are relatively self-contained
- You don’t need perfect terminology consistency across pages
When to use Sliding Window
- You’re translating continuous text (TXT, DOCX)
- You need consistent terminology
- The document has a flowing narrative
- You want a balance of quality and cost
When to use Context-Aware
- Translation quality is critical
- You need consistent terminology and style
- The document has complex structure
- You’re translating technical or literary content
- Cost is less of a concern
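The guidance in this section can be condensed into a small decision helper. This is only a sketch of the rules of thumb above, not part of Tinbox: the function name, the `priority` values, and the list of image extensions are assumptions.

```python
# Assumption: PDFs and common image formats need page-based processing,
# per the PDF Support and Image Support rows of the comparison table.
PAGE_ONLY_EXTENSIONS = {"pdf", "png", "jpg", "jpeg"}


def suggest_algorithm(file_extension: str, priority: str = "balanced") -> str:
    """Suggest an algorithm from the file type and the stated priority.

    priority is one of "speed", "balanced", or "quality" (hypothetical values).
    """
    ext = file_extension.lower().lstrip(".")
    if ext in PAGE_ONLY_EXTENSIONS:
        return "Page-by-Page"   # the only algorithm listed with PDF and image support
    if priority == "quality":
        return "Context-Aware"  # highest quality, highest cost
    if priority == "speed":
        return "Page-by-Page"   # fastest and cheapest
    return "Sliding Window"     # balance of quality and cost for continuous text


print(suggest_algorithm(".pdf"))
print(suggest_algorithm("docx"))
print(suggest_algorithm(".txt", priority="quality"))
```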