th0th supports multiple embedding providers for semantic code search. Choose a local provider (Ollama) or a cloud provider (Mistral, OpenAI) based on your requirements.

Provider Comparison

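| Provider | Model | Dimensions | Cost | Quality | Latency |
|----------|-------|------------|------|---------|---------|
| Ollama | nomic-embed-text, bge-m3 | 768-1024 | Free | Good | Low (local) |
| Mistral | mistral-embed | 1024 | $$ | Great | Medium |
| Mistral | codestral-embed | 1536-3072 | $$ | Great (code) | Medium |
| OpenAI | text-embedding-3-small | 1536 | $$ | Great | Medium |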

Quick Setup

100% Offline Setup

Ollama runs locally with no external dependencies or costs.
# Automated setup
./scripts/setup-local-first.sh

# Manual setup
ollama pull nomic-embed-text:latest
# or
ollama pull bge-m3
Configuration file (~/.config/th0th/config.json):
{
  "embedding": {
    "provider": "ollama",
    "model": "nomic-embed-text:latest",
    "baseURL": "http://localhost:11434",
    "dimensions": 768
  }
}
Environment variables (.env):
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=bge-m3
OLLAMA_EMBEDDING_DIMENSIONS=1024
The bge-m3 model provides better quality (1024 dimensions) compared to nomic-embed-text (768 dimensions), but requires more compute resources.
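As a rough illustration of how the environment variables above could map onto the `embedding` block, here is a minimal sketch. The helper `resolveOllamaConfig` and the `KNOWN_DIMENSIONS` table are hypothetical names introduced for this example, not part of th0th; the defaults mirror the values quoted in this guide, and the precedence (an explicit dimensions variable wins over the per-model default) is an assumption.

```typescript
// Hypothetical helper for illustration; not part of th0th's published API.
interface OllamaEmbeddingConfig {
  provider: "ollama";
  model: string;
  baseURL: string;
  dimensions: number;
}

// Dimensions stated in this guide for the two suggested Ollama models.
const KNOWN_DIMENSIONS: Record<string, number | undefined> = {
  "nomic-embed-text": 768,
  "bge-m3": 1024,
};

function resolveOllamaConfig(
  env: Record<string, string | undefined>,
): OllamaEmbeddingConfig {
  const model = env.OLLAMA_EMBEDDING_MODEL || "bge-m3";
  // Strip a tag such as ":latest" before looking up the known dimension.
  const family = model.replace(/:.*$/, "");
  const fromEnv = Number(env.OLLAMA_EMBEDDING_DIMENSIONS);
  // Assumption: an explicit, positive integer in the environment wins.
  const dimensions =
    Number.isInteger(fromEnv) && fromEnv > 0
      ? fromEnv
      : KNOWN_DIMENSIONS[family] ?? 1024;
  return {
    provider: "ollama",
    model,
    baseURL: env.OLLAMA_BASE_URL || "http://localhost:11434",
    dimensions,
  };
}
```

Calling `resolveOllamaConfig(process.env)` in Node.js would produce an object with the same shape as the `embedding` section of the configuration file.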

Multi-Provider Fallback

th0th automatically falls back to an alternative provider if the primary one fails. Providers are tried in the order given by the priority field of each configuration:
export const embeddingProviders: Record<string, EmbeddingProviderConfig> = {
  ollama: {
    provider: "ollama",
    model: process.env.OLLAMA_EMBEDDING_MODEL || "bge-m3",
    baseURL: process.env.OLLAMA_BASE_URL || "http://localhost:11434",
    dimensions: 1024,
    priority: 1, // Highest priority (local, fast, free)
    timeout: 300000, // 5 minutes
    maxRetries: 2,
  },
  
  mistralText: {
    provider: "mistral",
    model: process.env.MISTRAL_TEXT_EMBEDDING_MODEL || "mistral-embed",
    apiKey: process.env.MISTRAL_API_KEY,
    dimensions: 1024,
    priority: 2, // Fallback if Ollama unavailable
    timeout: 60000,
    maxRetries: 3,
  },

  mistralCode: {
    provider: "mistral",
    model: process.env.MISTRAL_CODE_EMBEDDING_MODEL || "codestral-embed",
    apiKey: process.env.MISTRAL_API_KEY,
    dimensions: 1536,
    priority: 3,
    timeout: 60000,
    maxRetries: 3,
  },
};

Priority System

  1. Priority 1: Ollama (local, fast, free)
  2. Priority 2: Mistral Text (general purpose)
  3. Priority 3: Mistral Code (specialized for code)
  4. Priority 10: OpenAI, Google, Cohere (optional)
A lower priority number means the provider is tried earlier. The system tries each provider in order until one succeeds.
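The ordering rule above can be sketched as a simple loop. This is an illustrative sketch only: the `Provider` interface and the functions `orderByPriority` and `embedWithFallback` are hypothetical names, and th0th's actual implementation may differ (for example, in how it applies retries before moving on).

```typescript
// Hypothetical types for illustration; th0th's internal interfaces may differ.
interface Provider {
  name: string;
  priority: number; // lower number = tried earlier
  embed: (text: string) => Promise<number[]>;
}

// Return a sorted copy so that the lowest priority number comes first.
function orderByPriority(providers: Provider[]): Provider[] {
  return [...providers].sort((a, b) => a.priority - b.priority);
}

// Try each provider in priority order and return the first successful result.
async function embedWithFallback(
  providers: Provider[],
  text: string,
): Promise<{ provider: string; vector: number[] }> {
  const errors: string[] = [];
  for (const p of orderByPriority(providers)) {
    try {
      return { provider: p.name, vector: await p.embed(text) };
    } catch (err) {
      // Record the failure and move on to the next provider.
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All embedding providers failed (${errors.join("; ")})`);
}
```

One consideration with any fallback scheme: if the fallback provider produces vectors with a different number of dimensions than the primary, its output will not be compatible with an index built from the primary's vectors.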

Advanced Configuration

Remote Ollama (WSL/Docker)

Point to Ollama running on a different host:
# Docker or WSL pointing to the host machine
export OLLAMA_BASE_URL=http://host.docker.internal:11434

# Remote server
export OLLAMA_BASE_URL=http://192.168.1.100:11434
Config file:
{
  "embedding": {
    "provider": "ollama",
    "model": "bge-m3",
    "baseURL": "http://host.docker.internal:11434",
    "dimensions": 1024
  }
}

Custom Model Dimensions

Override automatic dimension detection:
npx @th0th-ai/mcp-client --config-set embedding.dimensions 1024

Retry Configuration

  • timeout (number, default: 60000): Timeout for embedding requests in milliseconds. Ollama defaults to 300000 (5 minutes) for initial model loading.
  • maxRetries (number, default: 3): Maximum retry attempts for failed embedding requests.
  • priority (number, default: 10): Provider priority (1 = highest). Lower values are tried first.
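To make the interaction between `timeout` and `maxRetries` concrete, here is a minimal sketch of a per-attempt timeout combined with a retry loop. The helpers `withTimeout` and `withRetries` are hypothetical, and the sketch assumes that `maxRetries` counts retries after the first attempt; th0th may count attempts differently or add backoff between tries.

```typescript
// Hypothetical helpers illustrating the timeout and maxRetries settings.
interface RetryOptions {
  timeout: number; // milliseconds allowed per attempt
  maxRetries: number; // assumption: retries after the first attempt
}

// Reject if `task` does not settle within `ms` milliseconds.
// Note: this does not cancel the underlying request.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    task.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Run `attempt` up to maxRetries + 1 times, applying the timeout to each try.
async function withRetries<T>(
  attempt: () => Promise<T>,
  { timeout, maxRetries }: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= maxRetries; i++) {
    try {
      return await withTimeout(attempt(), timeout);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```

Under this reading, a provider configured with `timeout: 60000` and `maxRetries: 3` could spend up to four minutes on a single request before the next provider is tried.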

Rate Limiting (Ollama)

Ollama requests are automatically rate-limited to prevent server overload:
// 50ms delay between requests
private static readonly OLLAMA_DELAY_MS = 50;
This ensures stable performance when processing large codebases.
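A fixed delay between requests can be sketched as a sequential loop with a pause. This is an illustration under assumptions, not th0th's code: `embedSequentially` and `sleep` are hypothetical names, and only the 50 ms value comes from the snippet above.

```typescript
const OLLAMA_DELAY_MS = 50; // value quoted above

// Resolve after `ms` milliseconds.
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Process items one at a time, pausing between consecutive requests.
async function embedSequentially<T, R>(
  items: T[],
  embed: (item: T) => Promise<R>,
  delayMs: number = OLLAMA_DELAY_MS,
): Promise<R[]> {
  const results: R[] = [];
  let first = true;
  for (const item of items) {
    if (!first) await sleep(delayMs); // no delay before the first request
    first = false;
    results.push(await embed(item));
  }
  return results;
}
```

With a 50 ms delay, 1,000 chunks add roughly 50 seconds of waiting on top of the embedding time itself, which is the trade-off for not flooding a local server.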

Switching Providers

Using Config CLI

# Show current config
npx @th0th-ai/mcp-client --config-show

# Switch to Mistral
npx @th0th-ai/mcp-client --config-init --mistral YOUR_KEY

# Switch to OpenAI
npx @th0th-ai/mcp-client --config-init --openai YOUR_KEY

# Switch back to Ollama with different model
npx @th0th-ai/mcp-client --config-init --ollama-model nomic-embed-text

Manual Edit

Edit ~/.config/th0th/config.json directly:
# Find config path
npx @th0th-ai/mcp-client --config-path

# Edit with your preferred editor
vim ~/.config/th0th/config.json

Embedding Cache

th0th caches embeddings using SHA-256 content hashing to avoid redundant API calls:
# Cache location
EMBEDDING_CACHE_DB_PATH=./data/embedding-cache.db
Cache benefits:
  • Reduces API costs for cloud providers
  • Faster repeated searches on same content
  • Persists across restarts
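The idea of content-hash caching can be sketched with an in-memory map. The `EmbeddingCache` class below is a hypothetical stand-in for the on-disk cache database and assumes Node.js for the `node:crypto` import; the `compute` callback is synchronous only to keep the example short, whereas a real embedding call would be asynchronous.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory stand-in for the on-disk cache database.
class EmbeddingCache {
  private store = new Map<string, number[]>();
  hits = 0;
  misses = 0;

  // SHA-256 hex digest of the content; identical text yields the same key.
  static keyFor(content: string): string {
    return createHash("sha256").update(content, "utf8").digest("hex");
  }

  // Return the cached vector, or compute and store it on a miss.
  getOrCompute(content: string, compute: (c: string) => number[]): number[] {
    const key = EmbeddingCache.keyFor(content);
    const cached = this.store.get(key);
    if (cached !== undefined) {
      this.hits++;
      return cached;
    }
    this.misses++;
    const vector = compute(content);
    this.store.set(key, vector);
    return vector;
  }

  // Fraction of lookups served from the cache (0 when nothing was looked up).
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Because the key depends only on the content, unchanged files are served from the cache on re-indexing, and only new or edited chunks reach the provider.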

Performance Metrics

Cache hit rate typically reaches 80-90% after initial indexing, significantly reducing embedding API usage.

Troubleshooting

Ollama Connection Issues

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama
ollama serve

# Check logs (Linux with systemd)
journalctl -u ollama -f

Model Not Found

# List available models
ollama list

# Pull missing model
ollama pull bge-m3
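The same check can be done programmatically against Ollama's `/api/tags` endpoint, which lists installed models. The sketch below is illustrative: `hasModel` and `fetchTags` are hypothetical helper names, only the `name` field of the response is modeled, and the global `fetch` assumes Node.js 18 or later.

```typescript
// Shape of the relevant part of Ollama's GET /api/tags response.
interface OllamaTags {
  models: { name: string }[];
}

// True if `model` is installed; a bare name such as "bge-m3" matches any tag.
function hasModel(tags: OllamaTags, model: string): boolean {
  return tags.models.some(({ name }) =>
    model.includes(":") ? name === model : name.replace(/:.*$/, "") === model,
  );
}

// Fetch the installed models from an Ollama server.
async function fetchTags(
  baseURL: string = "http://localhost:11434",
): Promise<OllamaTags> {
  const res = await fetch(`${baseURL}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  return (await res.json()) as OllamaTags;
}
```

For example, `hasModel(await fetchTags(), "bge-m3")` returning `false` would indicate that the model still needs to be pulled.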

API Key Errors (Mistral/OpenAI)

Ensure your API key is set in either:
  • Config file: ~/.config/th0th/config.json
  • Environment variable: MISTRAL_API_KEY or OPENAI_API_KEY
# Verify API key is loaded
npx @th0th-ai/mcp-client --config-show

Dimension Mismatch

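Vectors produced by different models are not comparable, and a change in dimensions (for example, from 768 to 1024) will not match the existing index. If you change embedding models, you must re-index: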
# Clear existing embeddings
rm -rf ./data/chroma
rm ./data/embedding-cache.db

# Re-index with new model
curl -X POST http://localhost:3333/api/v1/project/index \
  -H "Content-Type: application/json" \
  -d '{"projectPath": "/path/to/project", "projectId": "my-project"}'
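To see why a mismatch is fatal rather than merely inaccurate, consider cosine similarity, a common way to compare embeddings: it is only defined for vectors of equal length. The sketch below uses hypothetical helper names (`assertDimensions`, `cosineSimilarity`) and is not th0th's search code.

```typescript
// Throw a descriptive error when a vector does not match the index dimension.
function assertDimensions(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(
      `Dimension mismatch: index expects ${expected}, embedding has ${vector.length}. ` +
        "Re-index after changing models.",
    );
  }
}

// Cosine similarity between two vectors of the same length.
function cosineSimilarity(a: number[], b: number[]): number {
  assertDimensions(b, a.length);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Comparing a 768-dimensional query vector against 1024-dimensional stored vectors fails the guard immediately, which is the situation the re-index steps above resolve.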

Next Steps

Advanced Configuration

Learn about cache settings, compression strategies, and logging configuration.
