Embedding Functions

Chroma uses embedding functions to convert your documents and queries into vector representations. You can use built-in embedding functions or create your own.

Default Embedding Function

If you don’t specify an embedding function, Chroma uses its default, the all-MiniLM-L6-v2 model run locally through ONNX:

import chromadb

client = chromadb.Client()

# Uses default embedding function
collection = client.create_collection(name="my_collection")

# Add documents - embeddings are generated automatically
collection.add(
    documents=["This is a document", "This is another document"],
    ids=["id1", "id2"]
)

The default embedding function creates 384-dimensional vectors using the all-MiniLM-L6-v2 model.

Built-in Embedding Functions

OpenAI Embeddings

Use OpenAI’s embedding models:

from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction

openai_ef = OpenAIEmbeddingFunction(
    api_key="your-api-key",
    model_name="text-embedding-ada-002"
)

collection = client.create_collection(
    name="openai_collection",
    embedding_function=openai_ef
)

Available OpenAI models:

text-embedding-ada-002 - Previous-generation model, 1536 dimensions
text-embedding-3-small - Smaller, faster, 1536 dimensions by default
text-embedding-3-large - Highest quality, 3072 dimensions by default

Cohere Embeddings

Use Cohere’s embedding models:

from chromadb.utils.embedding_functions import CohereEmbeddingFunction

cohere_ef = CohereEmbeddingFunction(
    api_key="your-api-key",
    model_name="embed-english-v3.0"
)

collection = client.create_collection(
    name="cohere_collection",
    embedding_function=cohere_ef
)

Hugging Face Embeddings

Use any Sentence Transformers model:

from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction

ef = SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

collection = client.create_collection(
    name="hf_collection",
    embedding_function=ef
)

Popular models:

all-MiniLM-L6-v2 - Fast, 384 dimensions
all-mpnet-base-v2 - High quality, 768 dimensions
paraphrase-multilingual-MiniLM-L12-v2 - Multilingual support, 384 dimensions

Instructor Embeddings

Task-specific embeddings with instructions:

from chromadb.utils.embedding_functions import InstructorEmbeddingFunction

instructor_ef = InstructorEmbeddingFunction(
    model_name="hkunlp/instructor-base",
    instruction="Represent the document for retrieval: ",
    device="cuda"  # or "cpu"
)

collection = client.create_collection(
    name="instructor_collection",
    embedding_function=instructor_ef
)

Google Gemini Embeddings

from chromadb.utils.embedding_functions import GoogleGenerativeAiEmbeddingFunction

gemini_ef = GoogleGenerativeAiEmbeddingFunction(
    api_key="your-api-key",
    model_name="models/embedding-001"
)

Amazon Bedrock Embeddings

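import boto3
from chromadb.utils.embedding_functions import AmazonBedrockEmbeddingFunction

# The function expects a boto3 session rather than raw credentials;
# check the constructor signature for your Chroma version.
session = boto3.Session(
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key",
    region_name="us-east-1"
)

bedrock_ef = AmazonBedrockEmbeddingFunction(
    session=session,
    model_name="amazon.titan-embed-text-v1"
)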

Custom Embedding Functions

Create your own embedding function by implementing the EmbeddingFunction protocol:

from typing import List

from chromadb.api.types import EmbeddingFunction, Documents

class MyEmbeddingFunction(EmbeddingFunction[Documents]):
    def __call__(self, input: Documents) -> List[List[float]]:
        # Your embedding logic here
        embeddings = []
        for doc in input:
            # Example: code points of the first 10 characters, space-padded
            # so that every vector has exactly 10 dimensions
            embedding = [float(ord(c)) for c in doc[:10].ljust(10, ' ')]
            embeddings.append(embedding)
        return embeddings
    
    @staticmethod
    def name() -> str:
        return "my_custom_function"
    
    @staticmethod
    def build_from_config(config: dict) -> "MyEmbeddingFunction":
        return MyEmbeddingFunction()
    
    def get_config(self) -> dict:
        return {}

# Use your custom function
my_ef = MyEmbeddingFunction()
collection = client.create_collection(
    name="custom_collection",
    embedding_function=my_ef
)

Custom Function Requirements

Your embedding function must:

Implement __call__ - Takes Documents and returns List[List[float]]
Implement name() - Returns a unique identifier
Implement build_from_config() - Recreates function from config
Implement get_config() - Returns serializable configuration
Return consistent dimensions - All embeddings must have same length
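
The get_config() and build_from_config() requirements are easiest to reason about as a round trip: the configuration returned by one should be enough for the other to rebuild an equivalent function. The sketch below uses a plain Python class rather than subclassing Chroma's EmbeddingFunction so that it runs without Chroma installed. The class name HashEmbeddingFunction, its dim parameter, and the hashing scheme are hypothetical and only illustrate the contract.

```python
import hashlib
import json
from typing import Dict, List


class HashEmbeddingFunction:
    """Toy embedding function (hypothetical) exposing the four required methods."""

    def __init__(self, dim: int = 8):
        # A SHA-256 digest has 32 bytes, so dim must not exceed 32.
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self._dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        embeddings = []
        for doc in input:
            digest = hashlib.sha256(doc.encode("utf-8")).digest()
            # Map the first `dim` bytes to floats in [0, 1]; every vector has length `dim`.
            embeddings.append([byte / 255.0 for byte in digest[: self._dim]])
        return embeddings

    @staticmethod
    def name() -> str:
        return "hash_embedding_function"

    @staticmethod
    def build_from_config(config: Dict[str, int]) -> "HashEmbeddingFunction":
        return HashEmbeddingFunction(dim=config["dim"])

    def get_config(self) -> Dict[str, int]:
        return {"dim": self._dim}


original = HashEmbeddingFunction(dim=6)
# Pass the config through JSON to confirm that it is serializable.
config = json.loads(json.dumps(original.get_config()))
restored = HashEmbeddingFunction.build_from_config(config)

docs = ["This is a document", "This is another document"]
same_output = original(docs) == restored(docs)
dimensions = {len(vector) for vector in restored(docs)}
```

If same_output is true and dimensions contains a single value, the function meets the serialization and consistent-dimension requirements in this toy setting.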

Advanced Custom Function

from typing import List

import requests
from chromadb.api.types import EmbeddingFunction, Documents

class RemoteEmbeddingFunction(EmbeddingFunction[Documents]):
    def __init__(self, api_url: str, api_key: str):
        self._api_url = api_url
        self._api_key = api_key
    
    def __call__(self, input: Documents) -> List[List[float]]:
        response = requests.post(
            self._api_url,
            headers={"Authorization": f"Bearer {self._api_key}"},
            json={"texts": input},
            timeout=30
        )
        # Fail loudly on HTTP errors instead of parsing an error body
        response.raise_for_status()
        return response.json()["embeddings"]
    
    @staticmethod
    def name() -> str:
        return "remote_embedding_function"
    
    @staticmethod
    def build_from_config(config: dict) -> "RemoteEmbeddingFunction":
        return RemoteEmbeddingFunction(
            api_url=config["api_url"],
            api_key=config["api_key"]
        )
    
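    def get_config(self) -> dict:
        # Caution: the config may be stored with the collection, which would
        # keep the API key in plain text. Consider storing the name of an
        # environment variable instead of the key itself.
        return {
            "api_url": self._api_url,
            "api_key": self._api_key
        }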

Embedding Function Configuration

Distance Metrics

Different embedding functions work best with different distance metrics:

collection = client.create_collection(
    name="my_collection",
    embedding_function=openai_ef,
    metadata={"hnsw:space": "cosine"}  # or "l2" or "ip"
)

Distance metrics:

cosine - Cosine distance (recommended for most text embedding models)
l2 - Squared Euclidean distance (the default when no metric is set)
ip - Inner product
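
To make the three options concrete, the sketch below computes each distance in plain Python for a few small vectors. It assumes the definitions commonly documented for hnsw:space: squared L2, one minus the inner product, and one minus cosine similarity. The function names are illustrative and are not part of Chroma's API.

```python
import math
from typing import Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    # Sum of squared differences (no square root is taken).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # One minus the dot product; only meaningful as a distance for normalized vectors.
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # One minus cosine similarity; insensitive to vector length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


a = [1.0, 0.0]
b = [0.0, 1.0]  # orthogonal to a
c = [2.0, 0.0]  # same direction as a, twice the length

l2_ab = squared_l2(a, b)
l2_ac = squared_l2(a, c)
ip_ac = inner_product_distance(a, c)
cos_ab = cosine_distance(a, b)
cos_ac = cosine_distance(a, c)
```

Here a and c point in the same direction, so their cosine distance is zero even though their squared L2 distance is not. This is one reason cosine is a common choice when embedding magnitudes carry no meaning.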

Query vs. Document Embeddings

Some embedding functions support different embeddings for queries vs documents:

class AsymmetricEmbeddingFunction(EmbeddingFunction[Documents]):
    # _embed_documents and _embed_query are placeholders for your model calls

    def __call__(self, input: Documents) -> List[List[float]]:
        # Embed documents
        return self._embed_documents(input)
    
    def embed_query(self, input: Documents) -> List[List[float]]:
        # Different embedding for queries
        return self._embed_query(input)
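
One common way to get asymmetric embeddings is to prepend a different instruction prefix to queries and to documents before calling the same model. The sketch below illustrates that idea without Chroma. The toy_embed function stands in for a real model, and the class name PrefixedEmbedder and the prefixes "passage: " and "query: " are assumptions chosen for illustration.

```python
from typing import List


def toy_embed(text: str) -> List[float]:
    """Stand-in for a real model: three simple text statistics."""
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(vowels), float(text.count(" "))]


class PrefixedEmbedder:
    """Hypothetical asymmetric embedder using different prefixes per role."""

    def __init__(self, document_prefix: str = "passage: ", query_prefix: str = "query: "):
        self._document_prefix = document_prefix
        self._query_prefix = query_prefix

    def __call__(self, input: List[str]) -> List[List[float]]:
        # Documents are embedded with the document prefix.
        return [toy_embed(self._document_prefix + text) for text in input]

    def embed_query(self, input: List[str]) -> List[List[float]]:
        # Queries are embedded with the query prefix.
        return [toy_embed(self._query_prefix + text) for text in input]


embedder = PrefixedEmbedder()
document_vectors = embedder(["cats"])
query_vectors = embedder.embed_query(["cats"])
```

The same text yields different vectors depending on its role, while both paths return vectors of the same dimension, which is what a collection requires.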

Working with Multiple Modalities

Multimodal Embeddings

For images and text:

from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction
from chromadb.utils.data_loaders import ImageLoader

clip_ef = OpenCLIPEmbeddingFunction()
image_loader = ImageLoader()

collection = client.create_collection(
    name="multimodal_collection",
    embedding_function=clip_ef,
    data_loader=image_loader
)

# Add images by URI
collection.add(
    ids=["img1"],
    uris=["path/to/image.jpg"]
)

# Query with text
results = collection.query(
    query_texts=["a photo of a cat"],
    n_results=5
)

Performance Considerations

Batch Processing

Embedding functions receive documents as a list, so pass many documents in a single add() call rather than making one call per document:

# Efficient: batch processing
collection.add(
    documents=[...],  # Large list processed in batches
    ids=[...]
)

# Inefficient: one add() call, and one embedding call, per document
for doc, doc_id in zip(documents, ids):
    collection.add(documents=[doc], ids=[doc_id])
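
When the full list is too large to pass in one call, a middle ground is to split it into fixed-size batches and issue one add() per batch. The helper below is a minimal sketch of that idea. The name iter_batches and the batch size of 3 are illustrative assumptions, and the add() call is shown only as a comment so the example runs without a collection.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_batches(
    documents: Sequence[str], ids: Sequence[str], batch_size: int
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield aligned (documents, ids) slices of at most batch_size items."""
    if len(documents) != len(ids):
        raise ValueError("documents and ids must have the same length")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(documents), batch_size):
        end = start + batch_size
        yield list(documents[start:end]), list(ids[start:end])


documents = [f"document {i}" for i in range(7)]
ids = [f"id{i}" for i in range(7)]

batches = list(iter_batches(documents, ids, batch_size=3))
batch_sizes = [len(batch_docs) for batch_docs, _ in batches]

# With a real collection, each batch would go to a single add() call:
# for batch_docs, batch_ids in iter_batches(documents, ids, batch_size=3):
#     collection.add(documents=batch_docs, ids=batch_ids)
```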

Caching

Implement caching in custom functions:

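from functools import lru_cache
from typing import List

from chromadb.api.types import EmbeddingFunction, Documents

class CachedEmbeddingFunction(EmbeddingFunction[Documents]):
    def __init__(self):
        # Per-instance cache: decorating the method directly would share one
        # cache across all instances and keep those instances alive
        self._embed_single = lru_cache(maxsize=1000)(self._compute)
    
    def _compute(self, text: str) -> List[float]:
        # Expensive embedding operation; compute_embedding is a placeholder
        return compute_embedding(text)
    
    def __call__(self, input: Documents) -> List[List[float]]:
        return [self._embed_single(doc) for doc in input]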

Troubleshooting

Dimension Mismatch

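Chroma raises an error when an embedding's dimension differs from the dimension already stored in the collection. This usually means documents were added or queried with a different embedding function or model than the one the collection was created with. Use the same embedding function for every add and query, and create a new collection when you switch models.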
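
A quick way to catch this early in a custom function is to validate the vectors before handing them to a collection. The helper below is a minimal sketch. The name check_dimensions and its error message are hypothetical and do not reproduce Chroma's own validation.

```python
from typing import Optional, Sequence


def check_dimensions(
    embeddings: Sequence[Sequence[float]], expected_dim: Optional[int] = None
) -> int:
    """Return the shared dimension, or raise ValueError on the first mismatch."""
    if not embeddings:
        raise ValueError("no embeddings to check")
    # Without an explicit expectation, use the first vector as the reference.
    dim = expected_dim if expected_dim is not None else len(embeddings[0])
    for index, vector in enumerate(embeddings):
        if len(vector) != dim:
            raise ValueError(
                f"embedding {index} has dimension {len(vector)}, expected {dim}"
            )
    return dim


good = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
bad = [[0.1, 0.2, 0.3], [0.4, 0.5]]

dim = check_dimensions(good)

try:
    check_dimensions(bad)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Passing expected_dim (for example 384 for all-MiniLM-L6-v2) also catches the case where every vector is consistent but comes from the wrong model.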

API Rate Limits

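from tenacity import retry, stop_after_attempt, wait_exponential

class RateLimitedEmbeddingFunction(EmbeddingFunction[Documents]):
    # Wait between 4 and 60 seconds between attempts; give up after 5 attempts
    # (without a stop condition, tenacity retries forever)
    @retry(
        wait=wait_exponential(multiplier=1, min=4, max=60),
        stop=stop_after_attempt(5)
    )
    def __call__(self, input: Documents) -> List[List[float]]:
        # api_call is a placeholder for your provider's client
        return api_call(input)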
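
For readers who prefer not to add a dependency, the same retry pattern can be sketched with the standard library alone. The function below doubles the delay after each failure, up to a cap. The name retry_with_backoff, its parameters, and the flaky_embed stand-in are hypothetical, and the schedule is a simplified one rather than a reimplementation of tenacity.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            # Real code should catch only the provider's rate-limit error.
            if attempt == max_attempts:
                raise
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky_embed() -> List[List[float]]:
    # Fails twice, then succeeds, to simulate a temporary rate limit.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("rate limited")
    return [[0.1, 0.2]]


delays: List[float] = []
# Record the delays instead of sleeping so the example runs instantly.
result = retry_with_backoff(flaky_embed, sleep=delays.append)
```

Injecting the sleep function keeps the helper easy to test. In production the default time.sleep is used.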

Memory Issues

For large batches, process in chunks:

def __call__(self, input: Documents) -> List[List[float]]:
    batch_size = 100
    embeddings = []
    for i in range(0, len(input), batch_size):
        batch = input[i:i + batch_size]
        embeddings.extend(self._embed_batch(batch))
    return embeddings
