Required software
Install Docker 20.10+
Docker is required to run all three services (API, ChromaDB, Ollama) in isolated containers. Download it from the official Docker site.
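Before continuing, it is worth confirming that the installed version meets the 20.10 minimum. A minimal sketch, assuming Docker is on the PATH (the version_ge helper is illustrative, not part of the project):

```shell
# version_ge A B: succeed when dot-separated version A >= B
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Requires Docker to be installed; --format extracts just the server version.
# version_ge "$(docker version --format '{{.Server.Version}}')" 20.10 \
#   && echo "Docker version OK"
# docker compose version --short   # should print a 2.x version
```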
Install Docker Compose 2.0+
SoftArchitect AI uses the docker compose (v2) CLI plugin, not the legacy docker-compose binary. Docker Desktop includes Compose v2 by default. On Linux, install it via the Docker Compose plugin guide.
Optional software
Ollama (local LLM mode)
Install Ollama on your host machine if you want to run LLM inference natively outside Docker. For the fully containerised setup, Ollama runs inside the
sa_ollama container automatically; no host installation needed.
API keys (cloud mode)
If you prefer cloud inference, obtain an API key from Google AI Studio (for Gemini) or from Groq. Cloud mode requires only 4 GB of RAM on the host.
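The key is typically supplied through the project's environment file. A sketch of what the relevant .env entries might look like (the variable names here are assumptions; check the project's own .env template for the actual ones):

```env
# Cloud inference: set the key for whichever provider you use
# (names are illustrative, not confirmed by the project)
GEMINI_API_KEY=your-gemini-key-here
GROQ_API_KEY=your-groq-key-here
```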
Hardware requirements
| Mode | Minimum RAM | Recommended RAM | Notes |
|---|---|---|---|
| Local Ollama | 8 GB | 16 GB | Ollama container default memory limit: 2 GB (adjustable via OLLAMA_MEMORY_LIMIT) |
| Cloud API (Gemini / Groq) | 4 GB | 8 GB | No local model weights downloaded |
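As a rough sanity check, you can compare the table above against what the host actually has available. A minimal Linux-only sketch (the 8 GB threshold mirrors the local-Ollama minimum; this is not the project's validate-docker-setup.sh):

```shell
# Read available memory from /proc/meminfo (value is reported in kB).
avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
min_kb=$((8 * 1024 * 1024))   # 8 GB expressed in kB

if [ "$avail_kb" -lt "$min_kb" ]; then
  echo "WARN: under 8 GB available; consider cloud mode or a lower OLLAMA_MEMORY_LIMIT"
else
  echo "RAM OK for local Ollama mode"
fi
```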
The validate-docker-setup.sh script in infrastructure/ warns you when available RAM appears to be below 8 GB and suggests reducing OLLAMA_MEMORY_LIMIT in your .env file.
Disk space
| Resource | Approximate size |
|---|---|
| Docker images (API + ChromaDB + Ollama) | ~3 GB |
| LLM model weights (e.g., llama3.2) | 2–8 GB, depending on the model |
| ChromaDB vector index | Grows with usage |
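Budgeting from the table, a full local setup can need roughly 15 GB (images, plus the largest model, plus headroom). A minimal sketch that checks free space on the filesystem holding Docker's default Linux data root (both the path and the threshold are assumptions, not project requirements):

```shell
# Docker's default data root on Linux; fall back to / if it doesn't exist yet.
target=/var/lib/docker
[ -d "$target" ] || target=/

# POSIX df: column 4 of the second line is available space in kB.
free_kb=$(df -Pk "$target" | awk 'NR==2 {print $4}')
need_kb=$((15 * 1024 * 1024))   # ~15 GB: 3 GB images + up to 8 GB weights + headroom

if [ "$free_kb" -lt "$need_kb" ]; then
  echo "WARN: under 15 GB free on the filesystem holding $target"
fi
```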
Operating system support
Linux
Fully supported. The native Docker engine provides the best performance for local LLM inference.
macOS
Supported via Docker Desktop. Apple Silicon (M1/M2/M3) can use Metal GPU acceleration with Ollama.
Windows
Supported via Docker Desktop with WSL 2 backend. NVIDIA GPU pass-through requires the NVIDIA Container Toolkit for WSL.