All runtime configuration for SoftArchitect AI lives in a single .env file at the repository root. This file is never committed to version control.
Creating your .env file
```bash
# From the repository root
cp .env.example .env
```
Then open .env in your editor, fill in any required API keys, and adjust the resource limits for your hardware.
The .env.example file at the repository root is the authoritative template. It is committed to version control to document every available variable and its default value. Never edit .env.example directly for local use — always copy it to .env first.
Variable reference
0. Global app settings
| Variable | Default | Description |
|---|---|---|
| PROJECT_NAME | "SoftArchitect AI" | Display name used in logs and API responses. |
| ENVIRONMENT | development | Runtime environment. Set to production for deployed instances. |
| DEBUG | False | Enables verbose debug output. Do not enable in production. |
| LOG_LEVEL | INFO | Logging verbosity. Accepts DEBUG, INFO, WARNING, ERROR. |
1. Docker resources and images
These variables control which Docker image versions are pulled and how much memory each container may use.
| Variable | Default | Description |
|---|---|---|
| PYTHON_VERSION | 3.12.3 | Python version used to build the sa_api image. |
| OLLAMA_IMAGE_VERSION | latest | Ollama Docker image tag. Pin to a specific version for reproducible builds. |
| CHROMADB_IMAGE_VERSION | latest | ChromaDB Docker image tag. |
| OLLAMA_MEMORY_LIMIT | 2GB | Maximum RAM for the sa_ollama container. Increase to 4GB if you have 16 GB+ RAM. |
| OLLAMA_CPU_SHARES | 1024 | Relative CPU weight for sa_ollama. |
| CHROMADB_MEMORY_LIMIT | 512MB | Maximum RAM for the sa_chromadb container. |
| API_MEMORY_LIMIT | 512MB | Maximum RAM for the sa_api container. |
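As an illustration, a host with 16 GB+ RAM could raise the limits along these lines. The Ollama value follows the recommendation in the table above; the ChromaDB value is only a suggested starting point, not a documented recommendation:

```shell
# .env — host with 16 GB+ RAM
OLLAMA_MEMORY_LIMIT=4GB
CHROMADB_MEMORY_LIMIT=1GB   # assumption: tune to your workload
```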
2. LLM provider selection
| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | gemini | Active LLM backend. Valid values: gemini, groq, ollama. |
Set LLM_PROVIDER=ollama for fully private, offline operation. Set it to gemini or groq for faster cloud-based inference with minimal hardware requirements.
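How a backend might branch on this variable can be sketched as follows. Here select_provider is a hypothetical helper, not the actual sa_api code, and it assumes the values from .env have already been loaded into the process environment:

```python
import os

# The three backends documented above.
VALID_PROVIDERS = {"gemini", "groq", "ollama"}

def select_provider() -> str:
    # Fall back to the template default when LLM_PROVIDER is unset.
    provider = os.getenv("LLM_PROVIDER", "gemini").lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
    return provider
```

Failing fast on an unrecognized value surfaces a typo at startup rather than as a confusing error deep inside a request.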
3. Gemini configuration
Used when LLM_PROVIDER=gemini.
| Variable | Default | Description |
|---|---|---|
| GEMINI_API_KEY | your_gemini_api_key_here | API key from Google AI Studio. Required for Gemini mode. |
| GEMINI_MODEL | gemini-3.1-flash-lite-preview | Gemini model identifier. |
4. Ollama configuration
Used when LLM_PROVIDER=ollama.
| Variable | Default | Description |
|---|---|---|
| OLLAMA_MODEL | llama3.2 | Model to load for inference. Recommended: qwen2.5-coder:7b, llama3.2, phi4-mini. |
| OLLAMA_BASE_URL | http://ollama:11434 | Ollama HTTP endpoint. This is the internal Docker network address used by sa_api to reach sa_ollama. |
The OLLAMA_BASE_URL value in your .env is overridden by a hardcoded environment variable in docker-compose.yml (http://sa_ollama:11434). You only need to change OLLAMA_BASE_URL if you are running Ollama outside of Docker.
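For example, pointing sa_api at an Ollama instance running directly on the host might look like this (11434 is Ollama's default port; note that if sa_api itself still runs inside Docker, localhost must be replaced by an address reachable from the container):

```shell
# .env — Ollama running on the host, outside Docker
OLLAMA_BASE_URL=http://localhost:11434
```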
5. Groq configuration
Used when LLM_PROVIDER=groq.
| Variable | Default | Description |
|---|---|---|
| GROQ_API_KEY | your_groq_api_key_here | API key from Groq Console. Required for Groq mode. |
| GROQ_MODEL | llama-3.3-70b-versatile | Groq model identifier. |
6. Vector DB (ChromaDB)
| Variable | Default | Description |
|---|---|---|
| CHROMADB_HOST | chromadb | ChromaDB hostname as seen from the sa_api container on sa_network. |
| CHROMADB_PORT | 8000 | ChromaDB internal port. Exposed to the host as 8001. |
| CHROMADB_DATA_PATH | /data/chromadb | Path inside the ChromaDB container where the vector index is persisted. |
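A minimal sketch of how a client could assemble the endpoint from these variables (the fallbacks match the template defaults; the actual connection code in sa_api may differ):

```python
import os

# Resolve the ChromaDB endpoint as seen from inside sa_network.
host = os.getenv("CHROMADB_HOST", "chromadb")
port = int(os.getenv("CHROMADB_PORT", "8000"))
endpoint = f"http://{host}:{port}"
```

From the host machine you would use port 8001 instead, since that is where the internal port 8000 is published.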
7. RAG and context limits
These variables prevent out-of-memory crashes when using models with small context windows.
| Variable | Default | Ollama 8K model | Gemini / Groq |
|---|---|---|---|
| LLM_MAX_PROMPT_CHARS | 200000 | 30000 | 200000 |
| RAG_MAX_CHUNKS | 3 | 2 | 3–5 |
LLM_MAX_PROMPT_CHARS is the maximum number of characters the prompt may contain (approximately tokens × 4). RAG_MAX_CHUNKS is the number of context fragments retrieved from ChromaDB per query.
```shell
# .env — local Ollama with 8K context window
LLM_MAX_PROMPT_CHARS=30000
RAG_MAX_CHUNKS=2
```

```shell
# .env — cloud API (Gemini or Groq)
LLM_MAX_PROMPT_CHARS=200000
RAG_MAX_CHUNKS=5
```
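With the characters ≈ tokens × 4 rule of thumb, 30000 characters is roughly 7500 tokens, which leaves headroom for the response inside an 8K window. Enforcing the limit can be sketched as below; clamp_prompt is a hypothetical helper, not the actual sa_api implementation:

```python
import os

def clamp_prompt(prompt: str) -> str:
    # Truncate to LLM_MAX_PROMPT_CHARS so a request cannot overflow
    # the model's context window (template default: 200000).
    limit = int(os.getenv("LLM_MAX_PROMPT_CHARS", "200000"))
    return prompt[:limit]
```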
8. Privacy, security, and memory
| Variable | Default | Description |
|---|---|---|
| IRON_MODE | True | When enabled, enforces strict data sovereignty rules — no external API calls are made unless LLM_PROVIDER explicitly uses a cloud backend. |
| PII_DETECTION_ENABLED | True | Scans user inputs for personally identifiable information before they are passed to the LLM. |
| CHAT_MAX_HISTORY_MESSAGES | 50 (template) / 100 (code) | Maximum number of messages retained in the conversation history to prevent context saturation. .env.example sets 50. |
| CHAT_MAX_MESSAGE_LENGTH | 20000 | Maximum character length of a single user message. |
Do not set IRON_MODE=False unless you fully understand the privacy implications. This flag is the primary enforcement gate that prevents accidental data leakage to external services.
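The history cap above can be sketched as a simple sliding window; trim_history is a hypothetical helper, not the actual sa_api code:

```python
import os

def trim_history(messages: list[dict]) -> list[dict]:
    # Keep only the most recent CHAT_MAX_HISTORY_MESSAGES entries
    # so long conversations cannot saturate the model's context.
    limit = int(os.getenv("CHAT_MAX_HISTORY_MESSAGES", "50"))
    return messages[-limit:]
```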
Full .env.example reference
For convenience, the complete template as shipped in the repository:
```shell
# ─────────────────────────────────────────────────────────────
# 0. GLOBAL APP SETTINGS
# ─────────────────────────────────────────────────────────────
PROJECT_NAME="SoftArchitect AI"
ENVIRONMENT=development
DEBUG=False
LOG_LEVEL=INFO
# ─────────────────────────────────────────────────────────────
# 1. DOCKER RESOURCES & IMAGES
# ─────────────────────────────────────────────────────────────
PYTHON_VERSION=3.12.3
OLLAMA_IMAGE_VERSION=latest
CHROMADB_IMAGE_VERSION=latest
OLLAMA_MEMORY_LIMIT=2GB
OLLAMA_CPU_SHARES=1024
CHROMADB_MEMORY_LIMIT=512MB
API_MEMORY_LIMIT=512MB
# ─────────────────────────────────────────────────────────────
# 2. LLM PROVIDER SELECTION
# ─────────────────────────────────────────────────────────────
LLM_PROVIDER=gemini
# ─────────────────────────────────────────────────────────────
# 3. GEMINI CONFIGURATION (Cloud - Default)
# ─────────────────────────────────────────────────────────────
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_MODEL=gemini-3.1-flash-lite-preview
# ─────────────────────────────────────────────────────────────
# 4. OLLAMA CONFIGURATION (Local/Offline)
# ─────────────────────────────────────────────────────────────
OLLAMA_MODEL=llama3.2
OLLAMA_BASE_URL=http://ollama:11434
# ─────────────────────────────────────────────────────────────
# 5. GROQ CONFIGURATION (Cloud - Fast)
# ─────────────────────────────────────────────────────────────
GROQ_API_KEY=your_groq_api_key_here
GROQ_MODEL=llama-3.3-70b-versatile
# ─────────────────────────────────────────────────────────────
# 6. VECTOR DB (ChromaDB)
# ─────────────────────────────────────────────────────────────
CHROMADB_HOST=chromadb
CHROMADB_PORT=8000
CHROMADB_DATA_PATH=/data/chromadb
# ─────────────────────────────────────────────────────────────
# 7. RAG & CONTEXT LIMITS
# ─────────────────────────────────────────────────────────────
LLM_MAX_PROMPT_CHARS=200000
RAG_MAX_CHUNKS=3
# ─────────────────────────────────────────────────────────────
# 8. PRIVACY, SECURITY & MEMORY
# ─────────────────────────────────────────────────────────────
IRON_MODE=True
PII_DETECTION_ENABLED=True
CHAT_MAX_HISTORY_MESSAGES=50
CHAT_MAX_MESSAGE_LENGTH=20000
```