Understanding Tinbox’s cost model and optimization strategies can help you significantly reduce translation expenses without sacrificing quality.
Cost Model Overview
Tinbox calculates costs based on token usage across different model providers.
Model Pricing (as of September 2025)
| Model | Input Cost (per 1K tokens) | Output Cost (per 1K tokens) | Notes |
|---|---|---|---|
| OpenAI GPT-5 | $0.00125 | $0.01 | Fast and cost-effective |
| Anthropic Claude 4 Sonnet | $0.003 | $0.015 | Higher quality, higher cost |
| Google Gemini 2.5 Pro | $0.00125 | $0.01 | Competitive pricing |
| Ollama (Local) | $0 | $0 | Free but requires local hardware |
Pricing information is defined in src/tinbox/core/cost.py:23-37. These are approximate costs and may vary by provider.
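The table above can be expressed as a simple per-1K-token lookup. This is a sketch for doing your own estimates; the actual structures and keys in `src/tinbox/core/cost.py` may differ:

```python
# Per-1K-token rates (input, output) from the pricing table above.
# Model keys here are illustrative, not necessarily what Tinbox uses.
PRICING = {
    "openai:gpt-5": (0.00125, 0.01),
    "anthropic:claude-4-sonnet": (0.003, 0.015),
    "google:gemini-2.5-pro": (0.00125, 0.01),
    "ollama": (0.0, 0.0),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for a given token count."""
    input_rate, output_rate = PRICING[model]
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate

# A 50K-token document translated 1:1 with GPT-5:
print(round(estimate_cost("openai:gpt-5", 50_000, 50_000), 4))  # 0.5625
```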
Token Estimation
Tinbox estimates tokens before translation using these approximations:
```python
# From cost.py:40-78
def estimate_document_tokens(file_path: Path) -> int:
    # PDF: 500 tokens per page
    # DOCX: 1.3 tokens per word (rounded up)
    # TXT: 1 token per 4 characters (rounded up)
```
- PDF: 500 tokens per page. A 100-page PDF ≈ 50,000 tokens. Reduce DPI for non-critical documents: `--pdf-dpi 150` can reduce token usage by ~25%.
- DOCX: 1.3 tokens per word. A 10,000-word document ≈ 13,000 tokens.
- TXT: 1 token per 4 characters. A 40,000-character file ≈ 10,000 tokens.
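A minimal sketch of these heuristics, assuming the rules above. The real function in cost.py reads the file itself; here page and word counts are passed in to keep the example self-contained:

```python
import math
from pathlib import Path

def estimate_document_tokens(file_path: Path, pdf_pages: int = 0,
                             words: int = 0) -> int:
    """Apply the per-format estimation rules described above."""
    suffix = file_path.suffix.lower()
    if suffix == ".pdf":
        return pdf_pages * 500                # 500 tokens per page
    if suffix == ".docx":
        return math.ceil(words * 1.3)         # 1.3 tokens per word, rounded up
    text = file_path.read_text(encoding="utf-8")
    return math.ceil(len(text) / 4)           # 1 token per 4 characters

print(estimate_document_tokens(Path("doc.pdf"), pdf_pages=100))   # 50000
print(estimate_document_tokens(Path("report.docx"), words=10_000))  # 13000
```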
Cost Overhead Factors
Algorithm Overhead
Different algorithms have different cost implications:
Page-by-Page (No Overhead)
Context-Aware (~4x Input Overhead)
```python
# From cost.py:170-177
if algorithm == "context-aware":
    input_tokens = estimate_context_aware_tokens(estimated_tokens)
    output_tokens = estimated_tokens
else:
    # Page and sliding-window: 1:1 ratio
    input_tokens = estimated_tokens
    output_tokens = estimated_tokens
```
Context Overhead Breakdown:
- Previous chunk context
- Previous translation context
- Next chunk preview
- Translation instructions
The context-aware algorithm increases input tokens by ~4x. For a 10,000-token document:
- Input tokens: ~40,000
- Output tokens: ~10,000
- Total cost impact: roughly 1.3x the cost of page-by-page at GPT-5 pricing, since output tokens, which dominate the cost, stay the same
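The branching in cost.py can be exercised directly. The flat 4x multiplier below stands in for `estimate_context_aware_tokens`, whose exact formula is not shown here:

```python
def estimate_job_tokens(estimated_tokens: int, algorithm: str) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) for a translation job.

    Context-aware multiplies input tokens by ~4 (assumed factor);
    page and sliding-window stay at a 1:1 ratio.
    """
    if algorithm == "context-aware":
        return estimated_tokens * 4, estimated_tokens
    return estimated_tokens, estimated_tokens

print(estimate_job_tokens(10_000, "context-aware"))  # (40000, 10000)
print(estimate_job_tokens(10_000, "page"))           # (10000, 10000)
```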
Prompt Overhead
All translations include system prompts and instructions:
```python
# From cost.py:180-181
prompt_factor = 0.03
input_tokens = math.ceil(input_tokens * (1 + prompt_factor))
```
Adds ~3% to input tokens for system prompts.
Glossary Overhead
When glossary is enabled, additional tokens are used:
```python
# From cost.py:183-189
glossary_factor = 0.20
if use_glossary:
    glossary_overhead_tokens = math.ceil(
        (input_tokens + output_tokens) * glossary_factor
    )
    input_tokens += glossary_overhead_tokens
```
Adds overhead equal to ~20% of the combined input and output tokens, charged to the input side, when the glossary is enabled.
Glossary overhead is worthwhile for technical documents where terminology consistency is critical.
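Worked through for a 50,000-token document translated 1:1, the glossary overhead formula above gives:

```python
import math

# Glossary overhead for a 50,000-token document with 1:1 input/output.
glossary_factor = 0.20
input_tokens = 50_000
output_tokens = 50_000

glossary_overhead_tokens = math.ceil(
    (input_tokens + output_tokens) * glossary_factor
)
input_tokens += glossary_overhead_tokens

print(glossary_overhead_tokens)  # 20000
print(input_tokens)              # 70000
```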
Cost Levels
Tinbox classifies estimated costs into levels:
```python
# From cost.py:82-97
class CostLevel(str, Enum):
    LOW = "low"              # < $1
    MEDIUM = "medium"        # $1-$5
    HIGH = "high"            # $5-$20
    VERY_HIGH = "very_high"  # > $20
```
The CLI will warn you when costs exceed certain thresholds.
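A helper mapping an estimated dollar amount onto these levels might look like the sketch below. The `classify_cost` name and exact threshold handling are assumptions, not taken from cost.py:

```python
from enum import Enum

class CostLevel(str, Enum):
    LOW = "low"              # < $1
    MEDIUM = "medium"        # $1-$5
    HIGH = "high"            # $5-$20
    VERY_HIGH = "very_high"  # > $20

def classify_cost(cost_usd: float) -> CostLevel:
    """Map an estimated cost in USD to a CostLevel (name assumed)."""
    if cost_usd < 1:
        return CostLevel.LOW
    if cost_usd < 5:
        return CostLevel.MEDIUM
    if cost_usd < 20:
        return CostLevel.HIGH
    return CostLevel.VERY_HIGH

print(classify_cost(0.56).value)   # low
print(classify_cost(12.00).value)  # high
```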
Optimization Strategies
1. Choose the Right Algorithm
Cost Priority: use page-by-page for the lowest cost.

```shell
# ~4x fewer input tokens than context-aware
tinbox translate --to es --algorithm page --model openai:gpt-4o document.pdf
```

For PDFs, page-by-page is both the best quality AND the cheapest option.

Quality Priority: use context-aware for the best coherence.

```shell
# Worth the cost for continuous narratives
tinbox translate --to fr --algorithm context-aware --model openai:gpt-4o novel.txt
```

Accept the ~4x input overhead when context preservation is essential.

Balanced: use page-by-page with a glossary.

```shell
# Low cost + terminology consistency
tinbox translate --to de --algorithm page --glossary --model openai:gpt-4o doc.pdf
```

The glossary adds ~20% overhead but ensures terminology consistency across pages.
2. Optimize Context Size
For context-aware algorithm, smaller chunks = more overhead:
```shell
# Larger chunks reduce overhead (cheaper)
tinbox translate --to es --context-size 3000 --model openai:gpt-4o document.txt

# Smaller chunks increase overhead (more expensive)
tinbox translate --to es --context-size 500 --model openai:gpt-4o document.txt
```
Optimal context sizes:
- Short documents: 1500-2000 characters
- Long documents: 2500-3000 characters
- Technical docs: 2000-2500 characters
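The reason smaller chunks cost more: each chunk carries a roughly fixed amount of surrounding context (previous chunk, previous translation, next preview, instructions), so more chunks means that context is resent more often. The sketch below is purely illustrative; the 1,000-character fixed overhead is an assumption, not a value from cost.py:

```python
import math

def context_overhead_ratio(doc_chars: int, context_size: int,
                           fixed_context_chars: int = 1_000) -> float:
    """Illustrative ratio of characters sent vs. document characters,
    assuming a fixed per-chunk context overhead."""
    chunks = math.ceil(doc_chars / context_size)
    total_sent = doc_chars + chunks * fixed_context_chars
    return total_sent / doc_chars

# A 60,000-character document:
print(round(context_overhead_ratio(60_000, 3000), 2))  # 1.33
print(round(context_overhead_ratio(60_000, 500), 2))   # 3.0
```

Under these assumptions, 500-character chunks send more than twice as much data as 3,000-character chunks for the same document.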
3. Use Local Models for Large Documents
For documents with 50K+ tokens, consider Ollama:
```shell
# Completely free, unlimited usage
ollama serve  # In another terminal
tinbox translate --to es --model ollama:llama3.1:8b large_document.txt
```
Local models from Ollama:
- ✅ Free: no API costs
- ✅ Unlimited: no rate limits
- ❌ Slower: ~20 tokens/sec vs ~30 for cloud
- ❌ Lower quality: may not match GPT-4o/Claude quality
- ❌ No vision: cannot process PDFs as images
Tinbox will warn you about large documents:
```python
# From cost.py:206-210
if estimated_total_tokens > 50000:
    warnings.append(
        f"Large document detected ({estimated_total_tokens:,} tokens). "
        "Consider using Ollama for no cost."
    )
```
4. Adjust PDF Quality
Reduce DPI for cost savings on PDFs:
```shell
# Default: 200 DPI (balanced)
tinbox translate --to es --pdf-dpi 200 --model openai:gpt-4o document.pdf

# Lower quality: 150 DPI (~25% fewer tokens)
tinbox translate --to es --pdf-dpi 150 --model openai:gpt-4o document.pdf

# High quality: 300 DPI (~50% more tokens)
tinbox translate --to es --pdf-dpi 300 --model openai:gpt-4o document.pdf
```
DPI Guidelines:
- 150: simple text documents, cost-sensitive
- 200: default, good balance
- 300: complex layouts, diagrams, small fonts
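Combining the 500-tokens-per-page estimate with the DPI percentages quoted above gives a rough per-document estimate. These scaling factors are approximations from this page, not values from cost.py:

```python
# Approximate token scaling by DPI: 150 ≈ -25%, 200 = baseline, 300 ≈ +50%.
DPI_FACTORS = {150: 0.75, 200: 1.0, 300: 1.5}

def estimate_pdf_tokens(pages: int, dpi: int = 200) -> int:
    """Rough PDF token estimate at a given DPI (illustrative only)."""
    return round(pages * 500 * DPI_FACTORS[dpi])

print(estimate_pdf_tokens(100, 150))  # 37500
print(estimate_pdf_tokens(100))       # 50000
print(estimate_pdf_tokens(100, 300))  # 75000
```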
5. Set Cost Limits
Prevent unexpected charges:
```shell
# Preview costs without translating
tinbox translate --to es --dry-run --model openai:gpt-4o document.pdf

# Set maximum cost threshold
tinbox translate --to es --max-cost 5.00 --model openai:gpt-4o document.pdf
```
Translation will stop if estimated or actual cost exceeds the limit:
```python
# From algorithms.py:299-302
if config.max_cost and total_cost > config.max_cost:
    raise TranslationError(
        f"Translation cost of {total_cost:.2f} exceeded maximum cost of {config.max_cost:.2f}"
    )
```
6. Minimize Reasoning Effort
Higher reasoning effort increases cost and time unpredictably:
```shell
# Minimal: fastest and cheapest (default)
tinbox translate --to es --reasoning-effort minimal --model openai:gpt-4o doc.txt

# Low: slightly better quality, higher cost
tinbox translate --to es --reasoning-effort low --max-cost 10.00 --model openai:gpt-4o doc.txt

# High: much higher cost, use only when needed
tinbox translate --to es --reasoning-effort high --max-cost 20.00 --model openai:gpt-4o doc.txt
```
From cost.py:232-236:

```python
if reasoning_effort != "minimal":
    warnings.append(
        f"Reasoning effort is '{reasoning_effort}', which means cost and time "
        "estimations are unreliable and will be much higher."
    )
```
Always use --max-cost with higher reasoning effort levels.
7. Use Checkpoints for Large Jobs
Avoid re-translating if interrupted:
```shell
# Save progress every 5 pages
tinbox translate --to es \
  --checkpoint-dir ./checkpoints \
  --checkpoint-frequency 5 \
  --model openai:gpt-4o \
  large_document.pdf

# Resume automatically if interrupted (resumes from last checkpoint)
tinbox translate --to es \
  --checkpoint-dir ./checkpoints \
  --model openai:gpt-4o \
  large_document.pdf
```
Checkpoints save both progress and accumulated costs, so you only pay for new translations.
Cost Comparison Examples
Example 1: 100-Page PDF
Estimated tokens: 50,000 (500 per page)

Page-by-Page (Cheapest)

```shell
tinbox translate --to es --algorithm page --model openai:gpt-4o doc.pdf
```

- Input tokens: ~51,500 (50,000 + 3% prompt overhead)
- Output tokens: ~50,000
- Cost: ~$0.56
  - Input: 51.5K × $0.00125 = $0.064
  - Output: 50K × $0.01 = $0.50

With Glossary

```shell
tinbox translate --to es --algorithm page --glossary --model openai:gpt-4o doc.pdf
```

- Input tokens: ~72,300 (50,000 + 3% prompt + 20% glossary)
- Output tokens: ~50,000
- Cost: ~$0.59
  - Input: 72.3K × $0.00125 = $0.090
  - Output: 50K × $0.01 = $0.50

Ollama (Free)

```shell
tinbox translate --to es --model ollama:llama3.1:8b doc.pdf
```

⚠️ Not supported: Ollama models don't support PDF vision. You would need to OCR first, which defeats the purpose.
Example 2: 50,000-Word Novel (Text File)
Estimated tokens: 65,000 (1.3 per word)

Page-by-Page

```shell
tinbox translate --to fr --algorithm page --model openai:gpt-4o novel.txt
```

- Input tokens: ~66,950
- Output tokens: ~65,000
- Cost: ~$0.73

⚠️ Not recommended: no context between pages, poor coherence.

Context-Aware

```shell
tinbox translate --to fr --algorithm context-aware --model openai:gpt-4o novel.txt
```

- Input tokens: ~267,800 (65,000 × 4 context + 3% prompt)
- Output tokens: ~65,000
- Cost: ~$0.98
  - Input: 267.8K × $0.00125 = $0.335
  - Output: 65K × $0.01 = $0.65

✅ Best quality: worth the extra $0.25 for coherence.

Ollama

```shell
tinbox translate --to fr --model ollama:llama3.1:8b novel.txt
```

- Cost: $0.00
- Time: ~54 minutes (65,000 ÷ 20 tokens/sec)

✅ Best for budget: free but slower.
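The time figure is simple arithmetic on the ~20 tokens/sec Ollama throughput quoted earlier; a back-of-the-envelope helper:

```python
def estimate_minutes(output_tokens: int, tokens_per_sec: float = 20.0) -> float:
    """Rough translation time in minutes at a given throughput."""
    return output_tokens / tokens_per_sec / 60

# 65,000 output tokens at ~20 tokens/sec:
print(round(estimate_minutes(65_000)))  # 54
```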
Cost Tracking During Translation
Tinbox displays real-time cost tracking in the progress bar:
```
Translating pages... ━━━━━━━━━━━━━━━━━━ 45/100 45% 0:01:23 $2.34
```
Use this to monitor spending and cancel if costs exceed expectations.
Best Practices
Always Dry Run First
```shell
tinbox translate --to es --dry-run --model openai:gpt-4o document.pdf
```
Preview estimated costs before committing.
Set a Cost Limit
```shell
tinbox translate --to es --max-cost 10.00 --model openai:gpt-4o document.pdf
```
Translation stops automatically if limit is exceeded.
Choose the Right Model
- GPT-5 or Gemini: best price/performance
- Claude Sonnet: higher quality, higher cost
- Ollama: free for text documents
Optimize for Document Type
- PDFs: use page-by-page with reasonable DPI
- Text: use context-aware with appropriate context size
- Large docs: consider Ollama or set lower DPI
Use Checkpoints
```shell
tinbox translate --to es --checkpoint-dir ./checkpoints --model openai:gpt-4o doc.pdf
```
Avoid re-paying for interrupted translations.
- Algorithm Comparison: understand the cost implications of each algorithm
- Output Formats: choose the right output format for your needs