## The guided architectural workflow
Every session follows a structured five-stage process that acts as a preventive quality gate:

| Stage | Purpose |
|---|---|
| Context | Capture project background, team constraints, and existing systems |
| Requirements | Define functional and non-functional requirements with the AI |
| Architecture | Select patterns, tech stacks, and validate against best practices |
| UX/UI | Design user flows and interface decisions grounded in architecture |
| Planning | Break the design into actionable development tasks and stories |
## Why local-first matters
SoftArchitect AI runs entirely on your machine. Your architecture decisions, business requirements, and source context never leave your network unless you explicitly choose a cloud provider. Two modes are available:

- Privacy mode — Runs inference via Ollama on your local hardware. Zero external API calls.
- Performance mode — Connects to Groq Cloud or Google Gemini for faster inference on modest hardware, with your explicit opt-in.
The `LLM_PROVIDER` variable in your `.env` file controls which mode is active.
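A minimal sketch of how such a switch might work — note that the exact provider strings (`ollama`, `groq`, `gemini`) are assumptions for illustration; check the project's `.env.example` for the values it actually accepts:

```python
def resolve_mode(env: dict[str, str]) -> str:
    """Map LLM_PROVIDER to privacy or performance mode.
    Provider strings here are illustrative, not the project's API."""
    provider = env.get("LLM_PROVIDER", "ollama").lower()
    if provider == "ollama":
        return "privacy"      # local inference, no external calls
    if provider in {"groq", "gemini"}:
        return "performance"  # explicit opt-in to a cloud API
    raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
```

Defaulting to the local provider when the variable is unset keeps the privacy-first promise: cloud inference only happens on explicit opt-in.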
## Key capabilities
### Contextual RAG & Tech Packs
A modular “Technical Encyclopedia” (`packages/knowledge_base/02-TECH-PACKS`) lets the assistant interview you to configure specific stacks — Flutter, Python, Firebase — with precise architecture rules baked in.

### Context Factory
Automatically generates technical documentation (`AGENTS.md`, `RULES.md`) so your AI coding copilot works with full project context from day one.

### Streaming chat
Real-time token streaming from any LLM provider with WebSocket delivery, heartbeat management, and backpressure handling for a fluid, non-blocking experience.
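One way to combine heartbeats with backpressure is a bounded queue between the LLM producer and the socket consumer. The sketch below is illustrative only — the frame format, queue size, and heartbeat interval are assumptions, not the project's actual implementation:

```python
import asyncio
from typing import AsyncIterator

async def deliver(tokens: AsyncIterator[str],
                  heartbeat: float = 15.0,
                  maxsize: int = 64) -> AsyncIterator[str]:
    """Relay LLM tokens to a client. The bounded queue provides
    backpressure (the producer blocks while the client is slow);
    if no token arrives within `heartbeat` seconds, a ping frame
    is emitted so the connection is not dropped as idle."""
    queue: asyncio.Queue[str | None] = asyncio.Queue(maxsize=maxsize)

    async def produce() -> None:
        async for token in tokens:
            await queue.put(token)   # blocks when the queue is full
        await queue.put(None)        # sentinel: stream finished

    producer = asyncio.create_task(produce())
    try:
        while True:
            try:
                item = await asyncio.wait_for(queue.get(), timeout=heartbeat)
            except asyncio.TimeoutError:
                yield "<ping>"       # hypothetical heartbeat frame
                continue
            if item is None:
                return
            yield item
    finally:
        producer.cancel()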
### Hardware-agnostic tuning

Two `.env` variables — `LLM_MAX_PROMPT_CHARS` and `RAG_MAX_CHUNKS` — let you match the RAG pipeline to your model’s context window, from an 8K local Ollama model to a 200K-token cloud API.

## Technology stack
| Layer | Technology |
|---|---|
| Frontend | Flutter Desktop (Linux, Windows, macOS) |
| Backend | Python 3.12 · FastAPI · LangChain |
| AI engine | Ollama (local) · Groq Cloud · Google Gemini |
| Vector store | ChromaDB |
| Infrastructure | Docker Compose |
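As a rough sketch of how these layers could be wired together with Docker Compose — service names, images, ports, and paths below are illustrative assumptions, so consult the repository's own `docker-compose.yml` for the real layout:

```yaml
# Illustrative layout only, not the project's actual compose file.
services:
  backend:
    build: ./backend            # FastAPI + LangChain app (assumed path)
    env_file: .env              # LLM_PROVIDER, RAG_MAX_CHUNKS, ...
    ports:
      - "8000:8000"
    depends_on:
      - chromadb
  chromadb:
    image: chromadb/chroma      # vector store backing RAG retrieval
    volumes:
      - chroma-data:/chroma/chroma
  # In privacy mode, Ollama typically runs on the host or as a service:
  # ollama:
  #   image: ollama/ollama
volumes:
  chroma-data:
```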
## Next steps

### Quickstart

Get SoftArchitect AI running locally in under 5 minutes.

### Architecture

Understand the Clean Architecture and Hexagonal design of the codebase.