Core Types
IDs
Record identifiers.
A single record identifier.
A list of record identifiers.
Documents
Text documents.
A single text document.
A list of text documents.
Embeddings
Vector embeddings.
A single embedding vector as a numpy array.
A list of embedding vectors.
A single embedding as Python list (alternative format).
Multiple embeddings as nested Python lists.
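The core types above are typically used as parallel lists, where position i in each list describes the same record. The following is a minimal sketch in plain Python that uses the list-based embedding format rather than numpy arrays. The alias names and the `check_parallel` helper are assumptions made for illustration, not the library's definitions.

```python
from typing import List

# Hypothetical aliases mirroring the descriptions above; the library's
# own names and definitions may differ.
ID = str
IDs = List[ID]
Document = str
Documents = List[Document]
PyEmbedding = List[float]          # a single embedding as a Python list
PyEmbeddings = List[PyEmbedding]   # multiple embeddings as nested lists


def check_parallel(ids: IDs, documents: Documents, embeddings: PyEmbeddings) -> int:
    """Return the record count, or raise if the lists do not line up."""
    if not (len(ids) == len(documents) == len(embeddings)):
        raise ValueError("ids, documents, and embeddings must have equal length")
    dimensions = {len(vector) for vector in embeddings}
    if len(dimensions) > 1:
        raise ValueError("all embeddings must share one dimensionality")
    return len(ids)


ids: IDs = ["doc-1", "doc-2"]
documents: Documents = ["first text", "second text"]
embeddings: PyEmbeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
record_count = check_parallel(ids, documents, embeddings)
```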
Metadata
Record metadata.
Metadata dictionary. Values can be:
- Strings
- Numbers (int or float)
- Booleans
- Lists of strings, numbers, or booleans (homogeneous)
- SparseVector objects
A list of metadata dictionaries.
Metadata for updates. Same as Metadata but allows None values to unset fields.
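The value rules above can be checked with a small validator. This is a sketch under stated assumptions: `validate_metadata` and `scalar_kind` are hypothetical helpers, SparseVector values are not handled, and the library's own validation may be stricter or differ in detail.

```python
from typing import Any, Dict


def scalar_kind(value: Any) -> str:
    """Classify a scalar as 'bool', 'number', or 'str'."""
    # bool is a subclass of int in Python, so it must be tested first.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "str"
    raise TypeError(f"unsupported metadata value: {value!r}")


def validate_metadata(metadata: Dict[str, Any], allow_none: bool = False) -> None:
    """Raise TypeError if a value breaks the rules listed above.

    Pass allow_none=True for update metadata, where None unsets a field.
    """
    for key, value in metadata.items():
        if value is None:
            if not allow_none:
                raise TypeError(f"{key!r}: None is only allowed in update metadata")
            continue
        if isinstance(value, list):
            kinds = {scalar_kind(item) for item in value}
            if len(kinds) > 1:
                raise TypeError(f"{key!r}: list values must be homogeneous")
            continue
        scalar_kind(value)


validate_metadata({"title": "Dune", "year": 1965, "tags": ["sci-fi", "classic"]})
validate_metadata({"title": None}, allow_none=True)  # update: unset "title"
```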
SparseVector
Sparse vector representation.
Non-zero indices (must be sorted in ascending order).
Corresponding values (must be same length as indices).
Indices must be non-negative, sorted, and have the same length as values.
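The three constraints can be expressed as a small validating container. `SparseVectorSketch` and `from_dense` are hypothetical stand-ins written for illustration, not the library's class. The source does not say whether duplicate indices are rejected, so this sketch does not reject them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SparseVectorSketch:
    """Hypothetical stand-in for SparseVector, not the library class."""
    indices: List[int]
    values: List[float]

    def __post_init__(self) -> None:
        if len(self.indices) != len(self.values):
            raise ValueError("indices and values must have the same length")
        if any(index < 0 for index in self.indices):
            raise ValueError("indices must be non-negative")
        if self.indices != sorted(self.indices):
            raise ValueError("indices must be sorted in ascending order")


def from_dense(dense: List[float]) -> SparseVectorSketch:
    """Keep only non-zero entries; enumerate() yields indices in ascending order."""
    pairs = [(index, value) for index, value in enumerate(dense) if value != 0.0]
    return SparseVectorSketch([i for i, _ in pairs], [v for _, v in pairs])


sparse = from_dense([0.0, 1.5, 0.0, 0.0, 2.0])
```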
URIs
Uniform Resource Identifiers for external data.
A single URI string.
A list of URI strings.
Query Filters
Where
Metadata filter for querying.
Metadata filter using MongoDB-style query operators:
Comparison Operators:
- {"field": value} - Equality (shorthand)
- {"field": {"$eq": value}} - Equality
- {"field": {"$ne": value}} - Not equal
- {"field": {"$gt": value}} - Greater than
- {"field": {"$gte": value}} - Greater than or equal
- {"field": {"$lt": value}} - Less than
- {"field": {"$lte": value}} - Less than or equal

- {"field": {"$in": [values]}} - In list
- {"field": {"$nin": [values]}} - Not in list

- {"field": {"$contains": value}} - Array contains value
- {"field": {"$not_contains": value}} - Array does not contain value

- {"$and": [conditions]} - All conditions must match
- {"$or": [conditions]} - Any condition must match
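To make the operator semantics concrete, here is a minimal in-memory evaluator for these filters. It is a sketch, not the library's query engine: `matches` and `OPERATORS` are hypothetical names, and treating a missing field as a non-match for every operator is a choice made for this example.

```python
from typing import Any, Callable, Dict

# Each operator receives the stored metadata value and the filter operand.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": lambda stored, operand: stored == operand,
    "$ne": lambda stored, operand: stored != operand,
    "$gt": lambda stored, operand: stored > operand,
    "$gte": lambda stored, operand: stored >= operand,
    "$lt": lambda stored, operand: stored < operand,
    "$lte": lambda stored, operand: stored <= operand,
    "$in": lambda stored, operand: stored in operand,
    "$nin": lambda stored, operand: stored not in operand,
    "$contains": lambda stored, operand: operand in stored,
    "$not_contains": lambda stored, operand: operand not in stored,
}


def matches(where: Dict[str, Any], metadata: Dict[str, Any]) -> bool:
    """Return True if the metadata satisfies every clause in the filter."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(sub, metadata) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(sub, metadata) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False  # sketch choice: a missing field never matches
            if not isinstance(condition, dict):
                condition = {"$eq": condition}  # shorthand equality
            for operator, operand in condition.items():
                if not OPERATORS[operator](metadata[key], operand):
                    return False
    return True


record = {"year": 2021, "genre": "sci-fi", "tags": ["space", "robots"]}
recent_scifi = {"$and": [{"genre": "sci-fi"}, {"year": {"$gte": 2020}}]}
is_match = matches(recent_scifi, record)
```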
WhereDocument
Document content filter for querying.
Document content filter using operators:
String Operators:
- {"$contains": "text"} - Document contains substring
- {"$not_contains": "text"} - Document does not contain substring
- {"$regex": "pattern"} - Document matches regex
- {"$not_regex": "pattern"} - Document does not match regex

- {"$and": [conditions]} - All conditions must match
- {"$or": [conditions]} - Any condition must match
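The document operators can be sketched the same way. `document_matches` is a hypothetical helper; it uses Python's `re.search` for the regex operators and case-sensitive substring checks, both of which are assumptions, since the library's regex dialect and matching rules may differ.

```python
import re
from typing import Any, Dict


def document_matches(where_document: Dict[str, Any], document: str) -> bool:
    """Return True if the document text satisfies every clause in the filter."""
    for operator, operand in where_document.items():
        if operator == "$contains":
            ok = operand in document
        elif operator == "$not_contains":
            ok = operand not in document
        elif operator == "$regex":
            ok = re.search(operand, document) is not None
        elif operator == "$not_regex":
            ok = re.search(operand, document) is None
        elif operator == "$and":
            ok = all(document_matches(sub, document) for sub in operand)
        elif operator == "$or":
            ok = any(document_matches(sub, document) for sub in operand)
        else:
            raise ValueError(f"unknown operator: {operator}")
        if not ok:
            return False
    return True


text = "Release 2.4 fixes the indexing bug"
mentions_fixed_release = document_matches(
    {"$and": [{"$contains": "fixes"}, {"$regex": r"\d+\.\d+"}]},
    text,
)
```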
Result Types
GetResult
Result from collection.get() operations.
List of record IDs (always included).
List of documents (if included).
List of metadata dictionaries (if included).
List of embeddings (if included).
List of URIs (if included).
List of fields that were included in the query.
QueryResult
Result from collection.query() operations.
Nested list of IDs (one list per query).
Nested list of documents (if included).
Nested list of metadata dictionaries (if included).
Nested list of embeddings (if included).
Nested list of distances (if included).
List of fields that were included in the query.
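Because every field is nested one list per query, the outer index selects the query and the inner index selects the hit. The sketch below walks a hand-built dictionary standing in for a result. The key names "ids" and "distances" and the helper name are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hand-built stand-in for the result of two queries (key names assumed).
query_result: Dict[str, Any] = {
    "ids": [["a", "b"], ["c"]],
    "distances": [[0.1, 0.4], [0.2]],
}


def pair_ids_with_distances(result: Dict[str, Any]) -> List[List[Tuple[str, float]]]:
    """For each query, pair every returned ID with its distance."""
    return [
        list(zip(ids, distances))
        for ids, distances in zip(result["ids"], result["distances"])
    ]


hits_per_query = pair_ids_with_distances(query_result)
first_query_hits = hits_per_query[0]  # hits for the first query only
```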
SearchResult
Result from collection.search() operations (experimental).
Nested list of IDs (one list per search).
Nested list of documents (if selected).
Nested list of embeddings (if selected).
Nested list of metadata dictionaries (if selected).
Nested list of ranking scores (if scoring is used).
List of selected keys for each payload.
rows(): Convert column-major format to row-major format, returning List[List[SearchResultRow]]
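The column-major to row-major conversion can be illustrated without the library. In this sketch, `to_rows` is a hypothetical function, rows are plain dictionaries rather than SearchResultRow objects, the key names are assumed, and columns that were not selected are assumed to be None.

```python
from typing import Any, Dict, List, Optional

Columns = Dict[str, Optional[List[List[Any]]]]


def to_rows(columns: Columns) -> List[List[Dict[str, Any]]]:
    """Convert per-field nested lists into one list of row dicts per search."""
    rows_per_search: List[List[Dict[str, Any]]] = []
    for search_index, ids in enumerate(columns["ids"]):
        rows = []
        for position, record_id in enumerate(ids):
            row: Dict[str, Any] = {"id": record_id}
            for key, column in columns.items():
                if key == "ids" or column is None:
                    continue  # skip the ID column and unselected fields
                row[key] = column[search_index][position]
            rows.append(row)
        rows_per_search.append(rows)
    return rows_per_search


columns: Columns = {
    "ids": [["a", "b"], ["c"]],
    "documents": [["alpha", "beta"], ["gamma"]],
    "scores": [[0.9, 0.5], [0.7]],
    "embeddings": None,  # not selected in this hypothetical search
}
rows = to_rows(columns)
```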
IndexingStatus
Indexing progress information.
Number of user operations that have been indexed.
Number of user operations pending indexing.
Total number of user operations in the collection.
Proportion of operations indexed (value between 0.0 and 1.0).
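The relationship between the four values can be sketched as follows. The class and field names are hypothetical, the total is assumed to equal indexed plus pending operations, and reporting 1.0 for an empty collection is a choice made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexingStatusSketch:
    """Hypothetical stand-in for IndexingStatus, not the library class."""
    num_indexed_ops: int
    num_unindexed_ops: int

    @property
    def total_ops(self) -> int:
        # Assumption: every operation is either indexed or pending.
        return self.num_indexed_ops + self.num_unindexed_ops

    @property
    def op_indexing_progress(self) -> float:
        if self.total_ops == 0:
            return 1.0  # sketch choice: nothing to index counts as complete
        return self.num_indexed_ops / self.total_ops


status = IndexingStatusSketch(num_indexed_ops=75, num_unindexed_ops=25)
progress = status.op_indexing_progress
```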
Schema Configuration Types
Schema
Collection schema for indexing and encryption configuration.
VectorIndexConfig
Configuration for dense vector indexes.
Distance metric: "cosine", "l2" (Euclidean), or "ip" (inner product).
Embedding function for the index.
Source key to extract vectors from. Defaults to "#document" for the default embedding.
HNSW algorithm configuration.
SPANN algorithm configuration.
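For orientation, the three metric names correspond to the following textbook quantities, shown here on plain Python lists. This is only a sketch of the underlying math: the values an index actually reports may be transformed variants (for example a squared distance, or one minus a similarity), which the source does not specify.

```python
import math
from typing import List


def inner_product(a: List[float], b: List[float]) -> float:
    """Sum of element-wise products ("ip")."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: List[float], b: List[float]) -> float:
    """Straight-line distance between two vectors ("l2")."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Inner product of the two vectors after normalizing their lengths ("cosine")."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


u = [1.0, 0.0]
v = [0.0, 1.0]
ip_uv = inner_product(u, v)
l2_uv = euclidean_distance(u, v)
cos_uv = cosine_similarity(u, v)
```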
HnswIndexConfig
Hierarchical Navigable Small World (HNSW) algorithm configuration.
Size of candidate list during index construction. Higher values improve quality but increase build time.
Maximum number of neighbors per node (M parameter). Higher values improve recall but increase memory usage.
Size of candidate list during search. Higher values improve recall but increase query time.
Number of threads for index construction.
Batch size for index operations.
Threshold for syncing index to disk.
Factor for resizing the index.
FtsIndexConfig
Full-text search index configuration.
FtsIndexConfig has no configurable parameters. Use it to enable full-text search on document fields.
SparseVectorIndexConfig
Sparse vector index configuration.
Sparse embedding function for the index.
Source key to extract sparse vectors from.
Enable BM25 weighting for sparse vectors.
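As background on what BM25 weighting means, the sketch below computes standard BM25 term weights for a tiny tokenized corpus. It illustrates the general formula only. How the index applies BM25 internally is not described in the source, and the function name, the `k1` and `b` defaults, and the smoothed IDF variant are choices made for this example. A real sparse vector would use integer term indices in place of the string keys.

```python
import math
from collections import Counter
from typing import Dict, List


def bm25_weights(corpus: List[List[str]], k1: float = 1.2, b: float = 0.75) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for doc in corpus for term in set(doc))

    weighted = []
    for doc in corpus:
        counts = Counter(doc)
        # Length normalization: longer documents are penalized.
        norm = k1 * (1 - b + b * len(doc) / avg_len)
        weights = {}
        for term, tf in counts.items():
            # Smoothed IDF that stays positive even for very common terms.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            weights[term] = idf * tf * (k1 + 1) / (tf + norm)
        weighted.append(weights)
    return weighted


corpus = [["cat", "sat"], ["cat", "ran"], ["dog", "ran", "far"]]
weights = bm25_weights(corpus)
```

In the first document, "sat" appears in fewer documents than "cat", so it receives the larger weight.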
Inverted Index Configs
Configuration for inverted indexes on metadata fields.These indexes are automatically enabled on metadata fields by default. No configuration required.
Embedding Functions
EmbeddingFunction
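To illustrate what a custom embedding function can look like, here is a minimal sketch. It assumes the protocol is a callable that takes a list of documents and returns one vector per document. The protocol shape, the parameter name `input`, and both class names are assumptions for this example. The hash-based vectors are deterministic but carry no semantic meaning.

```python
import hashlib
from typing import List, Protocol


class EmbeddingFunctionSketch(Protocol):
    """Assumed protocol shape: documents in, one embedding per document out."""

    def __call__(self, input: List[str]) -> List[List[float]]: ...


class HashEmbedding:
    """Toy embedding for testing pipelines; supports up to 32 dimensions."""

    def __init__(self, dimensions: int = 8) -> None:
        if not 1 <= dimensions <= 32:
            raise ValueError("dimensions must be between 1 and 32")
        self.dimensions = dimensions

    def __call__(self, input: List[str]) -> List[List[float]]:
        vectors = []
        for text in input:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Scale the first `dimensions` bytes into the range [0.0, 1.0].
            vectors.append([byte / 255.0 for byte in digest[: self.dimensions]])
        return vectors


embed: EmbeddingFunctionSketch = HashEmbedding(dimensions=4)
vectors = embed(["hello", "world"])
```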
Protocol for implementing custom embedding functions.
Encryption
Cmek
Customer-managed encryption key for collection data.
Cloud provider (currently only CmekProvider.GCP supported).
Provider-specific resource identifier for the encryption key.
- gcp(resource: str): Create a CMEK for Google Cloud Platform
- validate_pattern(): Validate the resource name format
- to_dict(): Serialize to dictionary
- from_dict(data: dict): Deserialize from dictionary
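The four methods can be pictured with a small stand-in class. Everything here is hypothetical: `CmekSketch` is not the library class, the dictionary layout is invented for the round trip, and the regular expression assumes the usual Cloud KMS key resource name shape (projects/.../locations/.../keyRings/.../cryptoKeys/...), which may not match the library's actual validation.

```python
import re
from dataclasses import dataclass
from typing import Dict

# Assumed shape of a Cloud KMS key resource name.
GCP_KEY_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+$"
)


@dataclass(frozen=True)
class CmekSketch:
    """Hypothetical stand-in for Cmek, not the library class."""
    provider: str
    resource: str

    @classmethod
    def gcp(cls, resource: str) -> "CmekSketch":
        return cls(provider="gcp", resource=resource)

    def validate_pattern(self) -> bool:
        return self.provider == "gcp" and GCP_KEY_PATTERN.match(self.resource) is not None

    def to_dict(self) -> Dict[str, str]:
        # Invented layout, used only to demonstrate a round trip.
        return {"provider": self.provider, "resource": self.resource}

    @classmethod
    def from_dict(cls, data: Dict[str, str]) -> "CmekSketch":
        return cls(provider=data["provider"], resource=data["resource"])


key = CmekSketch.gcp("projects/demo/locations/us/keyRings/ring/cryptoKeys/key")
restored = CmekSketch.from_dict(key.to_dict())
```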