Deployment Modes
In-Memory Mode (Ephemeral)
The simplest way to use Chroma for testing and prototyping. All data is stored in memory and lost when the process ends.
Use cases:
- Quick testing and experimentation
- Temporary data that doesn’t need persistence
- Development and prototyping
Characteristics:
- No disk I/O
- Fastest performance
- Data is not persisted
- Single process only
Persistent Mode
Stores data on disk, so it survives process restarts. Suited to local development.
Use cases:
- Local development with persistent data
- Small-scale applications
- Single-machine deployments
Characteristics:
- Data persisted to disk
- SQLite-based metadata storage
- Local file-based storage
- Single process recommended
- Default path: ./chroma
Client-Server Mode
Connect to a Chroma server over HTTP. Recommended for production deployments.
Use cases:
- Production deployments
- Multi-client access
- Remote access to Chroma
- Scalable workloads
Characteristics:
- Multiple clients can connect
- Network-based communication
- Supports authentication
- Can be deployed with Docker, Kubernetes, or cloud providers
Distributed Mode
A microservices-based architecture for production-scale deployments, orchestrated with Kubernetes.
Architecture components:
- Frontend Service: HTTP API endpoint (Rust-based)
- Query Service: Handles vector similarity search
- Compaction Service: Optimizes data storage
- Log Service: Manages the write-ahead log
- SysDB: Metadata and catalog management
- Garbage Collector: Cleans up unused data
Use cases:
- High-scale production deployments
- Multi-tenant environments
- High availability requirements
- Geographic distribution
Characteristics:
- Kubernetes-native
- Horizontally scalable
- Service-based architecture
- Production-grade observability
- High availability
Choosing the Right Deployment
| Mode | Development | Testing | Production | Multi-User | Scale |
|---|---|---|---|---|---|
| In-Memory | ✅ | ✅ | ❌ | ❌ | Small |
| Persistent | ✅ | ✅ | ⚠️ | ⚠️ | Small |
| Client-Server | ✅ | ✅ | ✅ | ✅ | Medium |
| Distributed | ⚠️ | ✅ | ✅ | ✅ | Large |
Decision Tree
Start here: Do you need data persistence?
- No → Use In-Memory mode
- Yes → Continue
Do you need multi-client or remote access?
- No → Use Persistent mode
- Yes → Continue
How many vectors do you expect to store?
- < 1M vectors → Use Client-Server mode with Docker
- 1M - 100M vectors → Use Client-Server mode with Kubernetes
- > 100M vectors → Use Distributed mode
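The decision tree above can be sketched as a small function. This is an illustration rather than part of Chroma: the function name and parameters are hypothetical, and it assumes that the middle branch asks whether multiple clients or remote access are needed and that the 1M and 100M boundaries both fall in the Kubernetes range.

```python
def choose_mode(
    needs_persistence: bool,
    needs_multi_client: bool,
    expected_vectors: int,
) -> str:
    """Map the three decision-tree questions to a deployment mode."""
    if not needs_persistence:
        return "In-Memory"
    if not needs_multi_client:
        return "Persistent"
    # From here on, only scale decides.
    if expected_vectors < 1_000_000:
        return "Client-Server (Docker)"
    if expected_vectors <= 100_000_000:
        return "Client-Server (Kubernetes)"
    return "Distributed"


print(choose_mode(True, True, 5_000_000))
```

The vector thresholds are rough guidance, so values near a boundary deserve a closer look at the workload than this function provides.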
Local Development with Tilt
For developers working on Chroma or testing distributed deployments locally, use Tilt.
Requirements
Setup
1. Start Kubernetes
2. Start Distributed Chroma (run tilt up)
3. Access Services
   - Chroma API: http://localhost:8000
   - Tilt Dashboard: http://localhost:10350
4. Clean Up
What Tilt Provides
Tilt automatically:
- Builds all service images from source
- Deploys a complete distributed Chroma cluster
- Sets up an observability stack (Grafana, Jaeger, Prometheus)
- Hot-reloads code changes
- Exposes port forwards for all services
Services and ports:
- Frontend (HTTP): 8000
- Query Service: 50053
- Log Service: 50054
- SysDB: 50051
- Grafana: via Tilt dashboard
- Jaeger: via Tilt dashboard
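As a quick way to confirm that the port forwards listed above are live before running tests, the following sketch probes each port with a plain TCP connection. The helper names are hypothetical, the ports are the ones listed above, and a successful connection only shows that something is listening, not that the service is healthy.

```python
import socket

# Ports taken from the list above; Grafana and Jaeger are reached
# through the Tilt dashboard and are not probed here.
TILT_PORTS = {
    "Frontend (HTTP)": 8000,
    "Query Service": 50053,
    "Log Service": 50054,
    "SysDB": 50051,
}


def is_port_open(port: int, host: str = "localhost", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_tilt_ports() -> dict:
    """Probe every forwarded port and report which ones accept connections."""
    return {name: is_port_open(port) for name, port in TILT_PORTS.items()}


for name, is_up in check_tilt_ports().items():
    print(f"{name}: {'reachable' if is_up else 'not reachable'}")
```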
Testing Requirement: When running tests locally, make sure tilt up is running. Some distributed Chroma tests will fail without it.
Environment Variables
Common environment variables across all deployment modes:
Next Steps
- Docker Deployment: Deploy Chroma using Docker and Docker Compose
- Kubernetes Deployment: Deploy Chroma on Kubernetes using Helm
- Cloud Providers: Deploy Chroma on AWS, GCP, or Azure
- Configuration: Learn about Chroma configuration options