Tinbox provides comprehensive cost estimation and tracking to help you manage translation expenses. Get upfront estimates before starting and monitor real-time costs during translation.
Tinbox uses a different estimation method for each file type. From cost.py:40-78:

    def estimate_document_tokens(file_path: Path) -> int:
        """Estimate the number of tokens in a document."""
        file_type = FileType(file_path.suffix.lstrip(".").lower())

        if file_type == FileType.PDF:
            # Rough estimate: 500 tokens per page
            import pypdf

            with open(file_path, "rb") as f:
                pdf = pypdf.PdfReader(f)
                return len(pdf.pages) * 500

        elif file_type == FileType.DOCX:
            # Rough estimate: 1.3 tokens per word, rounded up
            from docx import Document

            doc = Document(file_path)
            word_count = sum(len(p.text.split()) for p in doc.paragraphs)
            return int(word_count * 1.3 + 0.999)

        else:  # TXT
            # Rough estimate: 1 token per 4 characters, rounded up
            text = file_path.read_text()
            return -(-len(text) // 4)  # Ceiling division
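To get a feel for these heuristics without opening a real file, the arithmetic can be reproduced on plain counts. This is a minimal sketch, not Tinbox code: the helper names `estimate_pdf_tokens`, `estimate_docx_tokens`, and `estimate_txt_tokens` are hypothetical, and only the constants (500 tokens per page, 1.3 tokens per word, 4 characters per token) come from the excerpt.

```python
# Hypothetical helpers that mirror the per-file-type constants from the
# excerpt. They take counts directly instead of reading a file.

def estimate_pdf_tokens(page_count: int) -> int:
    """500 tokens per page, as in the PDF branch."""
    return page_count * 500


def estimate_docx_tokens(word_count: int) -> int:
    """1.3 tokens per word, rounded up the same way as the DOCX branch."""
    return int(word_count * 1.3 + 0.999)


def estimate_txt_tokens(char_count: int) -> int:
    """1 token per 4 characters, using ceiling division as in the TXT branch."""
    return -(-char_count // 4)


if __name__ == "__main__":
    print(estimate_pdf_tokens(12))   # 12 pages
    print(estimate_docx_tokens(7))   # 7 words
    print(estimate_txt_tokens(10))   # 10 characters
```

The PDF estimate depends only on page count, so a dense page and a nearly empty page are counted the same. Treat the result as a rough upper-level guide rather than a measurement.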
The context-aware algorithm uses significantly more input tokens due to context sharing. From cost.py:125-142:

    def estimate_context_aware_tokens(
        estimated_tokens: int, context_multiplier: float = 4
    ) -> int:
        """Estimate input tokens for context-aware translation.

        Context-aware algorithm uses more input tokens due to:
        - Previous chunk context
        - Previous translation context
        - Translation instructions
        """
        return math.ceil(estimated_tokens * context_multiplier)
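The context-aware algorithm multiplies the estimated input tokens by 4 by default (the context_multiplier parameter), which is three times the base estimate in added overhead. This improves quality but significantly increases cost, so always check the estimate before proceeding.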
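As a worked example of the multiplier, the sketch below repeats the one-line estimate from the excerpt and pairs it with the overhead percentage formula that appears in the cost warning message. The `overhead_percent` helper and its zero guard are assumptions added for illustration. They are not part of cost.py.

```python
import math


def estimate_context_aware_tokens(
    estimated_tokens: int, context_multiplier: float = 4
) -> int:
    """Same arithmetic as the excerpt: scale the base estimate and round up."""
    return math.ceil(estimated_tokens * context_multiplier)


def overhead_percent(estimated_tokens: int, input_tokens: int) -> int:
    """Hypothetical helper: extra input tokens as a whole-number percentage."""
    if estimated_tokens == 0:
        return 0  # assumption: avoid dividing by zero for empty documents
    context_overhead = input_tokens - estimated_tokens
    return context_overhead * 100 // estimated_tokens


if __name__ == "__main__":
    estimated = 10_000
    input_tokens = estimate_context_aware_tokens(estimated)
    print(input_tokens)                               # 40000
    print(overhead_percent(estimated, input_tokens))  # 300
```

With the default multiplier, a 10,000-token document is estimated at 40,000 input tokens, which is a 300% overhead on the input side.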
Tinbox generates warnings for cost-related issues. From cost.py:203-236:

    warnings = []

    if model != ModelType.OLLAMA:
        # Large document warning
        if estimated_total_tokens > 50000:
            warnings.append(
                f"Large document detected ({estimated_total_tokens:,} tokens). "
                "Consider using Ollama for no cost."
            )

        # Context-aware overhead warning
        if algorithm == "context-aware":
            context_overhead = input_tokens - estimated_tokens
            warnings.append(
                f"Context-aware algorithm uses additional input tokens for context "
                f"(+{context_overhead:,} tokens, ~{context_overhead * 100 // estimated_tokens}% overhead). "
                f"This improves translation quality but increases cost."
            )

        # Glossary overhead warning
        if use_glossary:
            warnings.append(
                f"Glossary enabled adds input token overhead (~20% of total tokens)."
            )

        # Max cost exceeded warning
        if max_cost and estimated_cost > max_cost:
            warnings.append(
                f"Estimated cost (${estimated_cost:.2f}) exceeds maximum "
                f"threshold (${max_cost:.2f})"
            )

    # Reasoning effort warning (applies to all models)
    if reasoning_effort != "minimal":
        warnings.append(
            f"Reasoning effort is '{reasoning_effort}', which means cost and time estimations are unreliable and will be much higher. "
            f"Make sure to set a --max-cost and keep an eye on the live cost and time predictions in the progress bar."
        )
Set a maximum cost threshold to prevent runaway expenses:
tinbox translate --to de --max-cost 10.00 document.pdf
From the translation algorithms:
    if config.max_cost and total_cost > config.max_cost:
        raise TranslationError(
            f"Translation cost of {total_cost:.2f} exceeded maximum cost of {config.max_cost:.2f}"
        )
Translation stops immediately when max cost is exceeded. Use checkpoints to resume later with adjusted limits.
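The guard above can be illustrated with a small, self-contained loop. This is a sketch under stated assumptions, not the Tinbox implementation: `run_with_budget` is a hypothetical function, the `TranslationError` class here is a local stand-in, and checking the running total after each chunk is an assumption about where the check sits.

```python
from typing import Iterable, Optional, Tuple


class TranslationError(Exception):
    """Local stand-in for the error type raised in the excerpt."""


def run_with_budget(
    chunk_costs: Iterable[float], max_cost: Optional[float] = None
) -> Tuple[int, float]:
    """Accumulate per-chunk costs and stop once the running total exceeds max_cost.

    Returns (chunks_completed, total_cost) when the budget is respected.
    """
    total_cost = 0.0
    completed = 0
    for cost in chunk_costs:
        total_cost += cost
        # Same condition as the excerpt: only enforced when max_cost is set.
        if max_cost and total_cost > max_cost:
            raise TranslationError(
                f"Translation cost of {total_cost:.2f} exceeded "
                f"maximum cost of {max_cost:.2f}"
            )
        completed += 1
    return completed, total_cost


if __name__ == "__main__":
    print(run_with_budget([1.0, 2.0], max_cost=10.0))
    try:
        run_with_budget([4.0, 4.0, 4.0], max_cost=10.0)
    except TranslationError as exc:
        print(exc)
```

Because the condition tests `max_cost` for truthiness, a limit of 0 behaves like no limit in this sketch. The check also fires only after a cost has been incurred, so the final total can land slightly above the threshold.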
    tinbox translate --to es --model openai --reasoning-effort high \
      --max-cost 50.00 manual.pdf
    # Warning: Cost estimation unreliable with high reasoning
    # Actual cost may be 10-20x higher than estimate
    # Always set --max-cost with reasoning models!
Reasoning effort makes cost estimates unreliable. Always monitor the real-time progress bar and set a safety limit with --max-cost.