Default Embedding Function
If you don’t specify an embedding function, Chroma uses the default ONNX MiniLM-L6-v2 model.
Built-in Embedding Functions
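The default model is also exposed as a built-in class. As a minimal sketch of the default behaviour described under Default Embedding Function, the snippet below creates an in-memory collection and passes `DefaultEmbeddingFunction` explicitly, which should behave the same as omitting the argument. The function name `demo_default_embedding`, the collection name, and the sample documents are illustrative assumptions; the first call typically downloads the ONNX model.

```python
from typing import List


def demo_default_embedding(collection_name: str = "docs") -> List[List[str]]:
    """Index two documents with Chroma's default embedding function and query them."""
    # Imported inside the function so this sketch loads even where chromadb
    # is not installed; in application code these would be top-level imports.
    import chromadb
    from chromadb.utils import embedding_functions

    client = chromadb.Client()  # ephemeral, in-memory client
    default_ef = embedding_functions.DefaultEmbeddingFunction()

    # Passing the default explicitly; leaving embedding_function out
    # selects the same ONNX MiniLM-L6-v2 model.
    collection = client.get_or_create_collection(
        name=collection_name,
        embedding_function=default_ef,
    )
    collection.add(
        ids=["doc-1", "doc-2"],
        documents=["Chroma stores embeddings.", "Bananas are yellow."],
    )
    result = collection.query(query_texts=["vector database"], n_results=1)
    return result["ids"]
```

Calling `demo_default_embedding()` returns the IDs of the closest document for the query text.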
OpenAI Embeddings
Use OpenAI’s embedding models:
- text-embedding-ada-002 - Previous generation, 1536 dimensions
- text-embedding-3-small - Smaller, faster
- text-embedding-3-large - Highest quality
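A short sketch of selecting one of these OpenAI models through Chroma's `OpenAIEmbeddingFunction`. It assumes the `openai` package is installed and that the API key is stored in an `OPENAI_API_KEY` environment variable; the helper name `make_openai_embedding_function` is hypothetical.

```python
import os


def make_openai_embedding_function(model_name: str = "text-embedding-3-small"):
    """Build an OpenAI-backed embedding function for a Chroma collection."""
    # Imported lazily so the sketch loads without chromadb installed.
    from chromadb.utils import embedding_functions

    return embedding_functions.OpenAIEmbeddingFunction(
        api_key=os.environ["OPENAI_API_KEY"],  # assumed location of the key
        model_name=model_name,
    )
```

The returned object can be passed as `embedding_function=` when creating a collection. Use the same model for indexing and querying, since vectors from different models are not comparable.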
Cohere Embeddings
Use Cohere’s embedding models.
Hugging Face Embeddings
Use any Sentence Transformers model:
- all-MiniLM-L6-v2 - Fast, 384 dimensions
- all-mpnet-base-v2 - High quality, 768 dimensions
- paraphrase-multilingual-MiniLM-L12-v2 - Multilingual support
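As a sketch, a Sentence Transformers model can be loaded locally through Chroma's `SentenceTransformerEmbeddingFunction`, assuming the `sentence-transformers` package is installed. The `EXPECTED_DIMENSIONS` table and both helper names are illustrative; the dimensions are the ones listed above.

```python
from typing import Dict, List

# Dimensions quoted in the list above (illustrative lookup table).
EXPECTED_DIMENSIONS: Dict[str, int] = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
}


def make_sentence_transformer_function(model_name: str = "all-MiniLM-L6-v2"):
    """Build a local Sentence Transformers embedding function for Chroma."""
    # Imported lazily so the sketch loads without chromadb installed.
    from chromadb.utils import embedding_functions

    return embedding_functions.SentenceTransformerEmbeddingFunction(
        model_name=model_name
    )


def has_expected_dimensions(embeddings: List[List[float]], model_name: str) -> bool:
    """Check that every vector has the dimension listed for the model."""
    expected = EXPECTED_DIMENSIONS[model_name]
    return all(len(vector) == expected for vector in embeddings)
```

A dimension check like this is a cheap way to catch a collection that was indexed with one model and is being queried with another.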
Instructor Embeddings
Task-specific embeddings with instructions.
Google Gemini Embeddings
Amazon Bedrock Embeddings
Custom Embedding Functions
Create your own embedding function by implementing the EmbeddingFunction protocol:
Custom Function Requirements
Your embedding function must:
- Implement __call__ - Takes Documents and returns List[List[float]]
- Implement name() - Returns a unique identifier
- Implement build_from_config() - Recreates function from config
- Implement get_config() - Returns serializable configuration
- Return consistent dimensions - All embeddings must have same length
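The requirements above can be sketched with a toy class that needs no ML model. Everything here is illustrative: `HashEmbeddingFunction` and its identifier `hash-embedding-sketch` are hypothetical names, and the hash-based vectors carry no semantic meaning. The class is written in plain Python so it runs without Chroma installed; in a real project it would normally subclass Chroma's `EmbeddingFunction`. The `__call__` parameter is named `input` on the assumption that recent Chroma versions expect that name.

```python
import hashlib
from typing import Any, Dict, List


class HashEmbeddingFunction:
    """Toy embedding function: deterministic, fixed-length vectors from SHA-256."""

    def __init__(self, dimensions: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, so at most 32 dimensions are available.
        if not 1 <= dimensions <= 32:
            raise ValueError("dimensions must be between 1 and 32")
        self.dimensions = dimensions

    def __call__(self, input: List[str]) -> List[List[float]]:
        """Map each document to a vector of `dimensions` floats in [0, 1]."""
        embeddings: List[List[float]] = []
        for text in input:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            embeddings.append([byte / 255.0 for byte in digest[: self.dimensions]])
        return embeddings

    @staticmethod
    def name() -> str:
        """Unique identifier for this function (hypothetical value)."""
        return "hash-embedding-sketch"

    def get_config(self) -> Dict[str, Any]:
        """Return a JSON-serializable description of this instance."""
        return {"dimensions": self.dimensions}

    @staticmethod
    def build_from_config(config: Dict[str, Any]) -> "HashEmbeddingFunction":
        """Recreate an equivalent instance from get_config() output."""
        return HashEmbeddingFunction(dimensions=config["dimensions"])


ef = HashEmbeddingFunction(dimensions=4)
vectors = ef(["first document", "second document"])
restored = HashEmbeddingFunction.build_from_config(ef.get_config())
print(len(vectors), len(vectors[0]))  # 2 4
```

The round trip through `get_config()` and `build_from_config()` should yield a function that produces identical vectors, so a stored collection can be reopened with the same embedding behaviour.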
Advanced Custom Function
Embedding Function Configuration
Distance Metrics
Different embedding functions work best with different distance metrics:
- cosine - Cosine similarity (recommended for most text embedding models)
- l2 - Squared Euclidean distance (Chroma's default)
- ip - Inner product
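To make the difference between the three metrics concrete, here is a pure-Python sketch of the distances as they are commonly defined for HNSW indexes: squared L2, one minus the inner product, and one minus the cosine similarity. The function names are illustrative and are not part of Chroma's API.

```python
import math
from typing import Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """l2: sum of squared component differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """ip: one minus the dot product."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """cosine: one minus the cosine similarity (ignores vector length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


a = [1.0, 0.0]
same_direction = [2.0, 0.0]  # same direction as a, twice the length

print(squared_l2(a, same_direction))              # 1.0
print(inner_product_distance(a, same_direction))  # -1.0
print(cosine_distance(a, same_direction))         # 0.0
```

Two vectors that point the same way but differ in length are identical under cosine and different under l2 and ip, which is why cosine suits models whose outputs are not normalized. In Chroma the metric is fixed when a collection is created, for example through the `hnsw:space` metadata key in older releases.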