Tinbox provides a flexible translation system with multiple algorithms for different use cases. The system uses a protocol-based interface for translator implementations.
ModelInterface
Protocol defining the interface for LLM translation models.
translate
Translate content according to a translation request.
Parameter: Translation request containing source/target languages, content, and configuration.
Returns: Translation response with translated text, token usage, cost, and timing information.
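A protocol-based interface means any class with matching methods qualifies, with no base class required. The following is a minimal sketch of that idea, covering translate and the validate_model method described next. The names Request, Response, Translator, EchoTranslator, and run, and all field names, are hypothetical stand-ins rather than Tinbox's actual definitions, and the real methods may be asynchronous.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for TranslationRequest (field names assumed)."""
    source_lang: str
    target_lang: str
    content: str


@dataclass(frozen=True)
class Response:
    """Hypothetical stand-in for TranslationResponse (field names assumed)."""
    text: str
    tokens_used: int
    cost: float
    time_taken: float


@runtime_checkable
class Translator(Protocol):
    """Sketch of the two-method interface; the real methods may be async."""

    def translate(self, request: Request) -> Response: ...

    def validate_model(self) -> bool: ...


class EchoTranslator:
    """Toy implementation: satisfies the protocol structurally, no inheritance."""

    def translate(self, request: Request) -> Response:
        # A real implementation would call an LLM here.
        tagged = f"[{request.source_lang}->{request.target_lang}] {request.content}"
        return Response(text=tagged, tokens_used=0, cost=0.0, time_taken=0.0)

    def validate_model(self) -> bool:
        return True


def run(model: Translator, request: Request) -> Response:
    # Check availability before spending tokens on a request.
    if not model.validate_model():
        raise RuntimeError("model is not available")
    return model.translate(request)
```

Because the check is structural, swapping providers only requires another class with the same two methods.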
validate_model
Validate that the model is properly configured and accessible.
Returns: True if the model is available and can be used for translation.
TranslationRequest
Configuration for a single translation request.
Source language code (e.g., "en", "fr", "ja").
Target language code (e.g., "de", "es", "zh").
Content to translate:
- str for text content
- bytes for image content (PNG format for scanned documents)
MIME type of the content. Must match pattern ^(text|image)/.+$.
- "text/plain" - Plain text content
- "image/png" - Image content (scanned PDFs)
Model provider to use. Options:
- ModelType.OPENAI - OpenAI models
- ModelType.ANTHROPIC - Anthropic Claude models
- ModelType.GEMINI - Google Gemini models
- ModelType.OLLAMA - Local Ollama models
Optional context information to improve translation quality and consistency. The context-aware algorithm provides:
- [PREVIOUS_CHUNK] tags with previous content
- [PREVIOUS_CHUNK_TRANSLATION] tags with previous translation
- [NEXT_CHUNK] tags with upcoming content
Additional model-specific parameters. Common parameters:
- model_name: Specific model to use (e.g., "gpt-4o", "claude-3-sonnet")
- temperature: Sampling temperature (if supported)
- max_tokens: Maximum output tokens (if supported)
Optional glossary for consistent term translations. The model will use these terms when translating.
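The fields described so far can be sketched as a frozen dataclass with the MIME-type pattern enforced at construction. This is an illustration only: RequestSketch and every field name in it are assumptions, the model provider field is omitted, and the standard-library dataclass stands in for whatever validation layer Tinbox actually uses.

```python
import re
from dataclasses import dataclass, field
from typing import Optional, Union

# Pattern quoted in the content-type description above.
CONTENT_TYPE_PATTERN = re.compile(r"^(text|image)/.+$")


@dataclass(frozen=True)
class RequestSketch:
    """Hypothetical stand-in; all field names are assumptions."""
    source_lang: str                      # e.g. "en"
    target_lang: str                      # e.g. "de"
    content: Union[str, bytes]            # str for text, bytes for PNG images
    content_type: str                     # e.g. "text/plain" or "image/png"
    context: Optional[str] = None         # optional surrounding-chunk context
    model_params: dict = field(default_factory=dict)
    glossary: Optional[dict] = None       # source term -> preferred translation

    def __post_init__(self) -> None:
        # Reject anything that is not text/* or image/*.
        if not CONTENT_TYPE_PATTERN.match(self.content_type):
            raise ValueError(f"invalid content type: {self.content_type!r}")


request = RequestSketch(
    source_lang="en",
    target_lang="de",
    content="Hello, world.",
    content_type="text/plain",
    model_params={"model_name": "gpt-4o", "temperature": 0.2},
    glossary={"world": "Welt"},
)
```

Passing a content type such as "application/pdf" raises ValueError, and assigning to a field after construction fails because the dataclass is frozen.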
Model reasoning effort level:
- "minimal" - Fast, cost-effective
- "low" - Slight improvement, moderate cost increase
- "medium" - Better quality, higher cost
- "high" - Best quality, significantly higher cost
TranslationRequest is immutable (frozen=True).
TranslationResponse
Response from a translation request or algorithm.
The translated text. For the page-by-page algorithm with failed pages, the text contains placeholders for those pages.
Total number of tokens used (input + output). Must be >= 0.
Total cost in USD. Must be >= 0.0.
Time taken in seconds. Must be >= 0.0.
New glossary entries discovered during translation (when glossary is enabled). Each entry contains:
- term: Term in source language
- translation: Translation in target language
List of page numbers that failed to translate (page-by-page algorithm only). Page numbers are 1-indexed.
Mapping from page number to error message for failed pages.
Non-fatal warnings during translation. Common warnings:
- Incomplete translation due to failed pages
- Cost approaching threshold
- Algorithm-specific issues
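A caller will usually want to check the failure and warning fields before trusting the translated text. The sketch below shows one way to do that. ResponseSketch, summarize, and all field names are hypothetical, chosen only to mirror the descriptions above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ResponseSketch:
    """Hypothetical stand-in; all field names are assumptions."""
    text: str
    tokens_used: int
    cost: float
    time_taken: float
    failed_pages: tuple = ()                         # 1-indexed page numbers
    page_errors: dict = field(default_factory=dict)  # page number -> message
    warnings: tuple = ()

    def __post_init__(self) -> None:
        # Mirrors the ">= 0" constraints listed for tokens, cost, and time.
        if self.tokens_used < 0 or self.cost < 0.0 or self.time_taken < 0.0:
            raise ValueError("tokens, cost and time must be non-negative")


def summarize(resp: ResponseSketch) -> list[str]:
    """Build a short human-readable report from a response."""
    lines = [f"{resp.tokens_used} tokens, ${resp.cost:.4f}, {resp.time_taken:.1f}s"]
    for page in resp.failed_pages:
        reason = resp.page_errors.get(page, "unknown error")
        lines.append(f"page {page} failed: {reason}")
    lines.extend(f"warning: {w}" for w in resp.warnings)
    return lines


resp = ResponseSketch(
    text="...",
    tokens_used=1200,
    cost=0.0123,
    time_taken=4.5,
    failed_pages=(2,),
    page_errors={2: "rate limited"},
    warnings=("Incomplete translation due to failed pages",),
)
report = summarize(resp)
```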
TranslationResponse is immutable (frozen=True).
Translation Algorithms
Tinbox provides three translation algorithms, each optimized for different scenarios.
Page-by-Page
Translates each page independently without context.
Characteristics:
- Fastest algorithm
- No context between pages
- Good for documents with independent sections
- Best for simple documents or when speed is the priority
- Supports resume from checkpoint
- Can continue despite individual page failures
Use cases:
- Simple documents
- Presentations with independent slides
- Documents where each page is self-contained
- Quick translations where context isn’t critical
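The "continue despite individual page failures" behavior can be sketched as a loop that records each failure and moves on. This is not Tinbox's implementation: translate_pages, the translate_page callback, and the placeholder text are all hypothetical, and checkpointing is left out.

```python
from typing import Callable


def translate_pages(
    pages: list[str],
    translate_page: Callable[[str], str],
) -> tuple[str, list[int], dict[int, str]]:
    """Translate pages independently, collecting failures instead of aborting."""
    translated: list[str] = []
    failed: list[int] = []
    errors: dict[int, str] = {}
    for number, page in enumerate(pages, start=1):  # page numbers are 1-indexed
        try:
            translated.append(translate_page(page))
        except Exception as exc:
            failed.append(number)
            errors[number] = str(exc)
            # Placeholder format is an assumption, not the library's format.
            translated.append(f"<page {number} failed>")
    return "\n\n".join(translated), failed, errors


def fake_translate(page: str) -> str:
    # Stand-in for a model call: fails on one page to exercise the error path.
    if "bad" in page:
        raise RuntimeError("boom")
    return page.upper()


text, failed, errors = translate_pages(["one", "bad", "three"], fake_translate)
```

Each page is an independent call, which is what makes the approach fast and easy to resume, at the price of no shared context between pages.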
Sliding Window
Processes text using overlapping windows for continuity.
Characteristics:
- Good balance between speed and quality
- Overlapping windows maintain some continuity
- Not suitable for image content (text only)
- Windows are merged intelligently after translation
- Supports resume from checkpoint
Use cases:
- Long text documents
- Content requiring some continuity
- When context-aware overhead is too high
- Technical documentation with cross-references
Configuration:
- window_size: Size of each window (default: 2000 characters)
- overlap_size: Overlap between windows (default: 200 characters)
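To make the two parameters concrete, here is a minimal sketch of how overlapping windows might be cut from a text. The make_windows helper is hypothetical. It covers only the splitting step, not the merge after translation, and the actual algorithm may pick boundaries differently.

```python
def make_windows(text: str, window_size: int = 2000, overlap_size: int = 200) -> list[str]:
    """Cut text into fixed-size windows that share overlap_size characters."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap_size < window_size:
        raise ValueError("overlap_size must be in [0, window_size)")

    step = window_size - overlap_size  # how far each window advances
    windows: list[str] = []
    start = 0
    while start < len(text):
        windows.append(text[start:start + window_size])
        if start + window_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return windows


# Small sizes make the overlap visible: each window repeats the last
# character of the previous one.
windows = make_windows("abcdefghij", window_size=4, overlap_size=1)
```

With the defaults, each window advances 1800 characters, so consecutive windows share 200 characters of text.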
Context-Aware
Splits text at natural boundaries with full context from adjacent chunks.
Characteristics:
- Highest quality translations
- Splits text at natural boundaries (paragraphs, sentences, etc.)
- Provides previous/next chunk context for each translation
- Higher input token usage (3-4x multiplier due to context)
- Not suitable for image content (text only)
- Supports resume from checkpoint
Use cases:
- Literary works and books
- Content requiring high consistency
- Documents with narrative flow
- Technical manuals with interconnected sections
Configuration:
- context_size: Target chunk size in characters (default: 2000)
- custom_split_token: Custom token to split on, ignoring context_size
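The splitting and context steps can be sketched as follows. Both helpers are hypothetical: split_chunks breaks only at paragraph boundaries (the real algorithm also considers sentences and other boundaries), and the closing-tag syntax in build_context is an assumption, since only the opening tag names are documented here.

```python
from typing import Optional


def split_chunks(
    text: str,
    context_size: int = 2000,
    custom_split_token: Optional[str] = None,
) -> list[str]:
    """Group paragraphs into chunks of roughly context_size characters."""
    if custom_split_token:
        # A custom token overrides size-based splitting entirely.
        return [part for part in text.split(custom_split_token) if part.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > context_size:
            chunks.append(current)  # close the chunk at a paragraph boundary
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_context(chunks: list[str], translations: list[str], index: int) -> Optional[str]:
    """Wrap the neighbours of chunk `index` in the documented tag names."""
    parts: list[str] = []
    if index > 0:
        parts.append(f"[PREVIOUS_CHUNK]\n{chunks[index - 1]}\n[/PREVIOUS_CHUNK]")
        parts.append(
            f"[PREVIOUS_CHUNK_TRANSLATION]\n{translations[index - 1]}\n"
            "[/PREVIOUS_CHUNK_TRANSLATION]"
        )
    if index + 1 < len(chunks):
        parts.append(f"[NEXT_CHUNK]\n{chunks[index + 1]}\n[/NEXT_CHUNK]")
    return "\n\n".join(parts) or None


chunks = split_chunks("aaaa\n\nbbbb\n\ncccc", context_size=10)
context = build_context(["c0", "c1", "c2"], ["t0"], index=1)
```

Sending the previous chunk, its translation, and the next chunk alongside each chunk is what drives the higher input token usage noted above.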
TranslationError
Exception raised when translation fails. Common causes:
- API authentication failures
- Network errors
- Rate limiting
- Invalid model configuration
- Cost exceeding max_cost threshold
- Unknown algorithm specified
- Model-specific errors (context length exceeded, etc.)
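Two of the causes above, an unknown algorithm and a cost over the max_cost threshold, can be checked before any model call is made. The sketch below illustrates that pattern with a locally defined TranslationError, because the import path is not shown here. The algorithm names, check_run, and safe_run are assumptions for illustration.

```python
from typing import Optional


class TranslationError(Exception):
    """Local stand-in for the library's exception class."""


# Algorithm identifiers are assumed; the page names the algorithms only in prose.
ALGORITHMS = {"page", "sliding-window", "context-aware"}


def check_run(algorithm: str, estimated_cost: float, max_cost: Optional[float] = None) -> None:
    """Raise TranslationError for problems detectable before translating."""
    if algorithm not in ALGORITHMS:
        raise TranslationError(f"Unknown algorithm: {algorithm}")
    if max_cost is not None and estimated_cost > max_cost:
        raise TranslationError(
            f"Estimated cost ${estimated_cost:.2f} exceeds max_cost ${max_cost:.2f}"
        )


def safe_run(algorithm: str, estimated_cost: float, max_cost: Optional[float] = None) -> str:
    """Catch the single exception type and turn it into a user-facing message."""
    try:
        check_run(algorithm, estimated_cost, max_cost)
    except TranslationError as exc:
        return f"translation failed: {exc}"
    return "ok"
```

Catching one exception type keeps caller code short, while the message text carries the specific cause.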