What is Metadata?
Metadata is structured information attached to each record in your collection. It consists of key-value pairs that describe attributes of your documents.
Supported Data Types
Chroma supports several metadata value types.
Primitive Types
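As an illustration, here is a metadata dictionary that uses the four primitive types this page names for list elements (`str`, `int`, `float`, `bool`). The keys and values are made up for the example.

```python
# Illustrative metadata for one record; keys and values are invented.
metadata = {
    "title": "Attention mechanisms",  # str
    "year": 2023,                     # int
    "score": 0.87,                    # float
    "peer_reviewed": True,            # bool
}

# Every value is exactly one of the four primitive types.
assert all(type(v) in (str, int, float, bool) for v in metadata.values())
```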
List Types
Metadata values can be lists of primitive types.
Lists must be:
- Non-empty
- Homogeneous (all elements same type)
- Limited to `str`, `int`, `float`, or `bool` elements
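The three list rules can be checked before sending data to Chroma. The sketch below is a hypothetical helper (`check_metadata_list` is not part of Chroma) that mirrors the rules as stated here; Chroma's own validation may differ in detail.

```python
from typing import Any, List

ALLOWED_ELEMENT_TYPES = (str, int, float, bool)


def check_metadata_list(values: List[Any]) -> None:
    """Raise ValueError if `values` breaks any of the three list rules."""
    if not values:
        raise ValueError("list must be non-empty")
    first_type = type(values[0])
    if first_type not in ALLOWED_ELEMENT_TYPES:
        raise ValueError(f"unsupported element type: {first_type.__name__}")
    # Compare exact types so that True (a bool) is not mixed with 1 (an int).
    if any(type(v) is not first_type for v in values):
        raise ValueError("all elements must share one type")


check_metadata_list(["nlp", "vision"])  # valid list, returns without raising
```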
Sparse Vectors
Store sparse vector data efficiently in metadata. Sparse vectors have the following constraints:
- Indices must be sorted and unique
- All arrays must have matching lengths
- Validation runs automatically via `__post_init__`
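To show how `__post_init__` can enforce these constraints, here is a minimal dataclass sketch. `SparseVectorSketch` is a hypothetical stand-in written for this example, not Chroma's own class, and its field names are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVectorSketch:
    """Hypothetical stand-in for a sparse vector; not Chroma's class."""

    indices: List[int]
    values: List[float]

    def __post_init__(self) -> None:
        # Runs automatically right after the generated __init__.
        if len(self.indices) != len(self.values):
            raise ValueError("indices and values must have the same length")
        # Strictly increasing indices are both sorted and unique.
        if any(a >= b for a, b in zip(self.indices, self.indices[1:])):
            raise ValueError("indices must be sorted and unique")


vec = SparseVectorSketch(indices=[0, 5, 42], values=[0.3, 0.8, 0.1])
```

Because the check lives in `__post_init__`, an invalid vector fails at construction time rather than later when it is stored.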
Reserved Keys
Chroma reserves certain metadata keys for internal use. Avoid keys that start with `chroma:` to prevent conflicts.
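A small pre-flight check can catch such keys before they reach Chroma. This sketch assumes the reserved keys are the ones beginning with the `chroma:` prefix mentioned above; `find_reserved_keys` is a hypothetical helper, and the example key is only an illustration.

```python
RESERVED_PREFIX = "chroma:"  # assumption: prefix taken from the text above


def find_reserved_keys(metadata: dict) -> list:
    """Return the keys in `metadata` that start with the reserved prefix."""
    return [key for key in metadata if key.startswith(RESERVED_PREFIX)]


clashes = find_reserved_keys({"category": "research", "chroma:document": "text"})
print(clashes)  # prints ['chroma:document']
```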
Adding Metadata
Single Record
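A minimal sketch of adding one record with metadata. It assumes `collection` is an existing Chroma collection (for example one returned by `client.get_or_create_collection`); the ID, text, and metadata values are invented.

```python
def add_single_record(collection) -> None:
    """Add one document together with its metadata."""
    collection.add(
        ids=["doc1"],
        documents=["Neural networks learn layered representations."],
        metadatas=[{"category": "research", "year": 2023}],
    )
```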
Multiple Records
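When adding several records at once, `ids`, `documents`, and `metadatas` are parallel lists: the metadata at position i belongs to the ID at position i. A sketch, again assuming `collection` is an existing Chroma collection and using invented values:

```python
def add_multiple_records(collection) -> None:
    """Add three documents, each with its own metadata dictionary."""
    collection.add(
        ids=["doc1", "doc2", "doc3"],
        documents=[
            "Neural networks learn layered representations.",
            "A beginner's guide to vector search.",
            "Release notes for version 2.",
        ],
        metadatas=[
            {"category": "research", "year": 2023},
            {"category": "tutorial", "year": 2021},
            {"category": "changelog", "year": 2024},
        ],
    )
```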
Without Documents
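A sketch of a document-free add, where each record carries only an embedding and metadata. It assumes `collection` is an existing Chroma collection; the three-dimensional embeddings and the metadata keys are invented for brevity.

```python
def add_without_documents(collection) -> None:
    """Add records that have embeddings and metadata but no document text."""
    collection.add(
        ids=["img1", "img2"],
        embeddings=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
        metadatas=[
            {"source": "camera_a", "width": 640},
            {"source": "camera_b", "width": 1280},
        ],
    )
```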
You can add records with embeddings and metadata but no documents.
Updating Metadata
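A sketch of a metadata update through `collection.update`, assuming `collection` is an existing Chroma collection that already contains a record with the ID `doc1`. The new values are invented.

```python
def update_record_metadata(collection) -> None:
    """Send new metadata for an existing record, identified by its ID."""
    collection.update(
        ids=["doc1"],
        metadatas=[{"category": "archive", "year": 2024}],
    )
```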
Update metadata for existing records.
Filtering with Metadata
Use the `where` parameter to filter results based on metadata.
Basic Equality
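An equality filter is a plain dictionary. The sketch shows the shorthand form and the explicit `$eq` form; the `category` key is an invented example.

```python
# Shorthand: match records whose "category" equals "research".
where_short = {"category": "research"}

# Explicit form of the same condition using the $eq operator.
where_explicit = {"category": {"$eq": "research"}}

# Usage, assuming `collection` is an existing Chroma collection:
# results = collection.get(where=where_short)
```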
Comparison Operators
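Comparison filters wrap the value in an operator dictionary. The sketch uses `$gte`, `$lt`, and `$ne`; `$gt` and `$lte` follow the same shape. Field names are invented.

```python
# Records published in 2020 or later.
published_since_2020 = {"year": {"$gte": 2020}}

# Records with a score strictly below 0.5.
low_score = {"score": {"$lt": 0.5}}

# Records whose status is anything other than "draft".
not_draft = {"status": {"$ne": "draft"}}

# Usage, assuming `collection` is an existing Chroma collection:
# results = collection.get(where=published_since_2020)
```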
List Operators
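The `$in` and `$nin` operators compare a scalar metadata value against a list of candidates. A sketch with invented field names:

```python
# Match records whose category is one of the listed values.
in_selected_categories = {"category": {"$in": ["research", "tutorial"]}}

# Match records whose category is none of the listed values.
not_in_excluded = {"category": {"$nin": ["changelog", "draft"]}}

# Usage, assuming `collection` is an existing Chroma collection:
# results = collection.get(where=in_selected_categories)
```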
Array Membership Operators
Logical Operators
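`$and` and `$or` each take a list of sub-filters, and the sub-filters can themselves use any operator. A sketch with invented field names:

```python
# Both conditions must hold.
recent_research = {
    "$and": [
        {"category": "research"},
        {"year": {"$gte": 2020}},
    ]
}

# At least one condition must hold.
research_or_tutorial = {
    "$or": [
        {"category": "research"},
        {"category": "tutorial"},
    ]
}

# Usage, assuming `collection` is an existing Chroma collection:
# results = collection.get(where=recent_research)
```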
Logical operators combine multiple conditions into one filter.
Filtering Documents
Use `where_document` to filter based on document content.
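Document filters are dictionaries too, but they apply to the stored text rather than to metadata. A sketch using `$contains` and `$not_contains` with invented search strings:

```python
# Keep records whose document text contains the phrase.
mentions_neural_networks = {"$contains": "neural network"}

# Keep records whose document text does not contain the word.
skips_deprecated = {"$not_contains": "deprecated"}

# Usage, assuming `collection` is an existing Chroma collection:
# results = collection.get(where_document=mentions_neural_networks)
```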
Combining Metadata and Document Filters
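`where` and `where_document` can be passed in the same call, in which case a record has to satisfy both. A sketch, assuming `collection` is an existing Chroma collection and using invented filter values:

```python
def get_recent_neural_network_docs(collection):
    """Fetch records from 2020 onward whose text mentions neural networks."""
    return collection.get(
        where={"year": {"$gte": 2020}},
        where_document={"$contains": "neural network"},
    )
```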
Querying with Filters
Combine similarity search with metadata filtering, for example to return only records that:
- Are in the “research” category
- Were published in 2020 or later
- Contain “neural network” in the text
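The three conditions above can be sketched as a single `collection.query` call. This assumes `collection` is an existing Chroma collection with `category` and `year` metadata; the function name and the `n_results=5` limit are choices made for this example.

```python
def search_research(collection, question: str):
    """Similarity search restricted by metadata and document content."""
    return collection.query(
        query_texts=[question],
        n_results=5,  # illustrative limit
        where={
            "$and": [
                {"category": "research"},   # in the "research" category
                {"year": {"$gte": 2020}},   # published in 2020 or later
            ]
        },
        where_document={"$contains": "neural network"},  # text must mention it
    )
```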
Type Definitions
From `chromadb/base_types.py`:
Validation
Chroma validates metadata to ensure data integrity.
Best Practices
Use consistent metadata schemas
Define a consistent structure for metadata across records:
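One way to keep the structure consistent is to build every metadata dictionary through a single function. `make_metadata` and its fields are hypothetical; adapt them to your own schema.

```python
def make_metadata(category: str, year: int, language: str = "en") -> dict:
    """Build metadata with the same keys and value types for every record."""
    return {"category": category, "year": year, "language": language}


paper = make_metadata("research", 2023)
guide = make_metadata("tutorial", 2021, language="de")

# Both records expose the same set of keys.
assert paper.keys() == guide.keys()
```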
Index frequently filtered fields
Plan your metadata based on how you’ll query:
Use appropriate data types
Choose the right type for your data:
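Values that arrive as strings (for example from a CSV file) are worth converting first, since numeric comparisons such as `$gte` are meant for numbers. A sketch with invented field names:

```python
# Raw values as they might arrive from a CSV row: everything is a string.
raw = {"year": "2023", "score": "0.87", "peer_reviewed": "true"}

# Convert each field to the type that matches how it will be filtered.
metadata = {
    "year": int(raw["year"]),                                 # numeric range filters
    "score": float(raw["score"]),                             # numeric range filters
    "peer_reviewed": raw["peer_reviewed"].lower() == "true",  # boolean equality
}
```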
Normalize metadata values
Use consistent formatting and casing:
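Equality filters match exact values, so "Machine Learning" and "machine_learning" would be treated as different categories. A hypothetical normalizer, applied both when writing metadata and when building filters, avoids that mismatch:

```python
def normalize_category(value: str) -> str:
    """Trim, lowercase, and join words with underscores."""
    return "_".join(value.strip().lower().split())


print(normalize_category("  Machine Learning "))  # prints machine_learning
```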
Next Steps
- Querying: Learn about querying with metadata filters
- Collections: Understand collection organization
- Filtering Guide: Advanced filtering techniques
- API Reference: Collection API methods