Chroma uses Sentence Transformers as its default embedding function, providing high-quality embeddings out of the box with no additional configuration.
Default model
Chroma uses the all-MiniLM-L6-v2 model by default:
- Dimensions: 384
- Speed: Very fast
- Quality: Excellent for most use cases
- Size: ~80MB
- Language: English (trained on English data)
How it works
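When you create a collection without specifying an embedding function, Chroma attaches the default model to that collection.
Automatic embedding
When you add documents, Chroma automatically generates their embeddings with that model, so you pass raw text rather than vectors.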
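The flow above can be sketched as follows. This is a minimal illustration, not the page's original example: the helper name `index_documents` and the `doc-N` ID scheme are hypothetical, and `client` is assumed to be a Chroma client such as the one returned by `chromadb.Client()`.

```python
from typing import List


def index_documents(client, name: str, documents: List[str]):
    """Create a collection that relies on the default embedding function.

    `client` is assumed to be a Chroma client (for example chromadb.Client()).
    """
    # No embedding_function argument is passed, so Chroma falls back to
    # its default model for this collection.
    collection = client.create_collection(name=name)

    # Hypothetical ID scheme for illustration: doc-0, doc-1, ...
    ids = [f"doc-{i}" for i in range(len(documents))]

    # Only raw text is passed; the embeddings are generated by Chroma.
    collection.add(documents=documents, ids=ids)
    return collection
```

With Chroma installed, this would be called as `index_documents(chromadb.Client(), "docs", ["first text", "second text"])`, after which `collection.query(query_texts=["..."], n_results=1)` searches by meaning.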
Explicit usage
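If you prefer to make the choice visible in code, the default can also be passed explicitly. The sketch below assumes the `DefaultEmbeddingFunction` class exposed by `chromadb.utils.embedding_functions`; the helper name `create_collection_explicit` is hypothetical, and the import is deferred so the helper can also be exercised with a stand-in embedding function.

```python
def create_collection_explicit(client, name: str, embedding_function=None):
    """Create a collection with an explicitly supplied embedding function.

    If none is given, build Chroma's default one (assumes chromadb is installed).
    """
    if embedding_function is None:
        # Deferred import: only needed when the default is actually built.
        from chromadb.utils import embedding_functions

        embedding_function = embedding_functions.DefaultEmbeddingFunction()

    # Same result as omitting the argument, but the choice is spelled out.
    return client.create_collection(
        name=name,
        embedding_function=embedding_function,
    )
```

Spelling out the embedding function makes it easier to swap models later without hunting for where the default was implied.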
ONNX Runtime
Chroma uses ONNX Runtime for fast, efficient inference.
Benefits of ONNX
- Fast inference: Optimized for CPU inference without requiring heavy ML frameworks
- Small footprint: Minimal dependencies compared to full PyTorch or TensorFlow
- Cross-platform: Works consistently across Windows, macOS, and Linux
- No GPU required: Runs efficiently on CPU alone
Model characteristics
Performance
Quality metrics
The all-MiniLM-L6-v2 model achieves:
- Semantic Textual Similarity: 82.41% correlation
- Semantic Search: 58.04 mean average precision
- Paraphrase Mining: 70.93 F1 score
Customization
While Sentence Transformers is the default, you can easily switch to other embedding functions:
- OpenAI
- Cohere
- Custom Sentence Transformer
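As a rough sketch of how switching might look, the helper below maps a provider name to one of the embedding function classes in `chromadb.utils.embedding_functions`. The helper name `make_embedding_function` and the specific model names are assumptions chosen for illustration. OpenAI and Cohere need an API key and their client packages, and the custom option needs the `sentence-transformers` package.

```python
from typing import Optional

SUPPORTED_PROVIDERS = ("default", "openai", "cohere", "sentence-transformers")


def make_embedding_function(provider: str, api_key: Optional[str] = None):
    """Return a Chroma embedding function for the given provider name."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")

    # Deferred import so validation works even without chromadb installed.
    from chromadb.utils import embedding_functions as ef

    if provider == "openai":
        # Model name is an illustrative assumption.
        return ef.OpenAIEmbeddingFunction(
            api_key=api_key, model_name="text-embedding-3-small"
        )
    if provider == "cohere":
        # Model name is an illustrative assumption.
        return ef.CohereEmbeddingFunction(
            api_key=api_key, model_name="embed-multilingual-v3.0"
        )
    if provider == "sentence-transformers":
        # Any Sentence Transformers model name can go here.
        return ef.SentenceTransformerEmbeddingFunction(
            model_name="paraphrase-multilingual-MiniLM-L12-v2"
        )
    return ef.DefaultEmbeddingFunction()
```

The returned object is passed as `embedding_function=` when creating the collection. The same function should be supplied whenever the collection is reopened, since vectors from different models are not comparable.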
Model caching
Cache location
Pre-downloading
When to use
✅ Good for
- Prototyping: Get started quickly with no API keys
- English content: Trained on English data
- Privacy: All processing happens locally
- Cost-sensitive: No API costs
- Offline use: Works without internet (after initial download)
⚠️ Consider alternatives if
- You need multilingual support → Use Cohere or multilingual Sentence Transformers
- You need highest quality → Use OpenAI text-embedding-3-large
- You’re processing very large volumes → Consider hosted embedding APIs
- You need domain-specific embeddings → Fine-tune or use specialized models
Troubleshooting
Model download fails
Slow first run
The first embedding operation downloads the model (~80MB). Subsequent operations are fast. Consider pre-downloading in your deployment process.
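One way to pre-download is to embed a throwaway string during the build or deploy step, which triggers the download before real traffic arrives. The sketch below is an assumption-based illustration: `warm_up` is a hypothetical helper, and it relies on `DefaultEmbeddingFunction` from `chromadb.utils.embedding_functions` when no function is supplied.

```python
def warm_up(embedding_function=None, probe_text: str = "warm-up") -> int:
    """Embed one short string so the model is downloaded and loaded.

    Returns the embedding dimension as a quick sanity check.
    """
    if embedding_function is None:
        # Deferred import: requires chromadb only when the default is used.
        from chromadb.utils import embedding_functions

        embedding_function = embedding_functions.DefaultEmbeddingFunction()

    # The first call is the slow one; it fetches the model if it is not cached.
    vectors = embedding_function([probe_text])
    return len(vectors[0])
```

Running `warm_up()` once in an image build or startup hook moves the download cost out of the first user request. For the default model the returned dimension should match the 384 listed above.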
Out of memory errors
Process documents in smaller batches:
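A minimal batching sketch, assuming `collection` is a Chroma collection with the usual `add(documents=..., ids=...)` method. The helper name `add_in_batches` and the default batch size of 100 are illustrative choices, not recommended values.

```python
from typing import List


def add_in_batches(
    collection,
    documents: List[str],
    ids: List[str],
    batch_size: int = 100,
) -> int:
    """Add documents in fixed-size slices and return the number of batches."""
    if len(documents) != len(ids):
        raise ValueError("documents and ids must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    batches = 0
    for start in range(0, len(documents), batch_size):
        end = start + batch_size
        # Only one slice is embedded at a time, which bounds peak memory.
        collection.add(documents=documents[start:end], ids=ids[start:end])
        batches += 1
    return batches
```

If memory errors persist, lowering `batch_size` is the first thing to try.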
Resources
The default Sentence Transformer model provides excellent performance for most use cases. You only need to switch to a different embedding function if you have specific requirements like multilingual support or domain-specific embeddings.