SoftArchitect AI is not just another code chat. It is your shadow software architect: an assisted work environment that guides the most critical, and most often overlooked, phase of development: project conception. By applying RAG (Retrieval-Augmented Generation) over a curated knowledge base, it checks that your application design complies with SOLID principles, Clean Architecture, and OWASP security guidelines before you write the first line of code. Its mission is simple: turn abstract ideas into development-ready technical specifications, cutting technical debt at the root.

The guided architectural workflow

Every session follows a structured five-stage process that acts as a preventive quality gate:
| Stage | Purpose |
| --- | --- |
| Context | Capture project background, team constraints, and existing systems |
| Requirements | Define functional and non-functional requirements with the AI |
| Architecture | Select patterns and tech stacks, and validate them against best practices |
| UX/UI | Design user flows and interface decisions grounded in the architecture |
| Planning | Break the design into actionable development tasks and stories |
Each stage builds on the last, so decisions made early — about domain boundaries, data ownership, or security posture — propagate cleanly into the final technical specification.
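As an illustration of that accumulation, the five stages can be modeled as an ordered pipeline where each stage records its output for the stages that follow. This is a minimal sketch, not the product's actual API; the `Session` class and its fields are invented:

```python
from dataclasses import dataclass, field
from enum import Enum

class Stage(Enum):
    CONTEXT = 1
    REQUIREMENTS = 2
    ARCHITECTURE = 3
    UX_UI = 4
    PLANNING = 5

@dataclass
class Session:
    """Accumulates decisions as the workflow advances stage by stage."""
    stage: Stage = Stage.CONTEXT
    decisions: dict = field(default_factory=dict)

    def complete(self, output: dict) -> None:
        # Record this stage's output so later stages can build on it.
        self.decisions[self.stage.name] = output
        if self.stage is not Stage.PLANNING:
            self.stage = Stage(self.stage.value + 1)

session = Session()
session.complete({"team_size": 3})          # Context
session.complete({"nfr": ["p95 < 200ms"]})  # Requirements
print(session.stage.name)  # ARCHITECTURE
```

The point of the ordering is that nothing in Planning is decided without the Architecture and Requirements records already in hand.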

Why local-first matters

SoftArchitect AI runs entirely on your machine. Your architecture decisions, business requirements, and source context never leave your network unless you explicitly choose a cloud provider. Two modes are available:
  • Privacy mode — Runs inference via Ollama on your local hardware. Zero external API calls.
  • Performance mode — Connects to Groq Cloud or Google Gemini for faster inference on modest hardware, with your explicit opt-in.
The LLM_PROVIDER variable in your .env controls which mode is active.
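For example, a minimal .env for privacy mode might look like the following. Only the variable name LLM_PROVIDER comes from the text above; the concrete values and the key names for performance mode are illustrative:

```bash
# Privacy mode: all inference stays on local hardware via Ollama
LLM_PROVIDER=ollama

# Performance mode (explicit opt-in): switch the provider and supply a key
# LLM_PROVIDER=groq
# GROQ_API_KEY=...
```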

Key capabilities

Contextual RAG & Tech Packs

A modular “Technical Encyclopedia” (packages/knowledge_base/02-TECH-PACKS) lets the assistant interview you to configure specific stacks — Flutter, Python, Firebase — with precise architecture rules baked in.

Context Factory

Automatically generates technical documentation (AGENTS.md, RULES.md) so your AI coding copilot works with full project context from day one.
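A sketch of the idea, assuming the session's decisions arrive as a plain dict. The file names AGENTS.md and RULES.md come from the description above; the function, its signature, and the templates are hypothetical, not the real Context Factory internals:

```python
import tempfile
from pathlib import Path

def write_context_files(project: dict, out_dir: Path) -> list[Path]:
    """Render AGENTS.md and RULES.md from the session's decisions."""
    out_dir.mkdir(parents=True, exist_ok=True)
    agents = (
        f"# Agents\n\nProject: {project['name']}\n"
        f"Stack: {', '.join(project['stack'])}\n"
    )
    rules = "# Rules\n\n" + "\n".join(f"- {r}" for r in project["rules"]) + "\n"
    written = []
    for name, body in [("AGENTS.md", agents), ("RULES.md", rules)]:
        path = out_dir / name
        path.write_text(body, encoding="utf-8")
        written.append(path)
    return written

files = write_context_files(
    {"name": "demo", "stack": ["Flutter", "FastAPI"], "rules": ["Hexagonal ports only"]},
    Path(tempfile.mkdtemp()),
)
print([p.name for p in files])  # ['AGENTS.md', 'RULES.md']
```

Generating these files once, from the finished specification, is what lets a coding copilot start with full project context instead of rediscovering it chat by chat.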

Streaming chat

Real-time token streaming from any LLM provider with WebSocket delivery, heartbeat management, and backpressure handling for a fluid, non-blocking experience.
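The token-forwarding and heartbeat mechanics can be sketched with a plain asyncio loop. This mirrors the pattern, not the actual WebSocket handler: the fake token source, message shapes, and timeout value are all invented:

```python
import asyncio
from typing import AsyncIterator

async def fake_llm_tokens() -> AsyncIterator[str]:
    # Stand-in for a provider's streaming API.
    for tok in ["Clean ", "Architecture ", "first."]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield tok

async def stream_with_heartbeat(send, heartbeat_every: float = 10.0) -> None:
    """Forward tokens as they arrive; emit a ping if the model stays silent."""
    tokens = fake_llm_tokens().__aiter__()
    while True:
        try:
            tok = await asyncio.wait_for(tokens.__anext__(), timeout=heartbeat_every)
        except StopAsyncIteration:
            await send({"type": "done"})
            return
        except asyncio.TimeoutError:
            await send({"type": "ping"})  # heartbeat keeps the socket alive
            continue
        await send({"type": "token", "data": tok})

messages = []

async def collect(msg):
    # In the real app this would be the WebSocket's send; here we just record.
    messages.append(msg)

asyncio.run(stream_with_heartbeat(collect))
print([m["type"] for m in messages])  # ['token', 'token', 'token', 'done']
```

Because each token is awaited and forwarded individually, a slow consumer naturally slows the loop down rather than forcing unbounded buffering, which is the essence of backpressure handling.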

Hardware-agnostic tuning

Two .env variables — LLM_MAX_PROMPT_CHARS and RAG_MAX_CHUNKS — let you match the RAG pipeline to your model’s context window, from an 8K local Ollama model to a 200K-token cloud API.
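A rough picture of how the two knobs could interact, assuming retrieved chunks arrive in relevance order. The variable names match the .env settings above, but the function and its trimming logic are illustrative, not the pipeline's actual code:

```python
def fit_context(chunks: list[str], question: str,
                llm_max_prompt_chars: int = 8000,
                rag_max_chunks: int = 4) -> str:
    """Keep at most rag_max_chunks chunks, then drop more if the
    assembled prompt would still exceed the character budget."""
    kept = chunks[:rag_max_chunks]
    while kept and len("\n\n".join(kept)) + len(question) > llm_max_prompt_chars:
        kept.pop()  # drop the lowest-ranked chunk first (list is relevance-ordered)
    return "\n\n".join(kept) + "\n\n" + question

prompt = fit_context(
    ["chunk-a" * 100, "chunk-b" * 100, "chunk-c" * 100],
    "Which pattern fits?",
    llm_max_prompt_chars=1500,
    rag_max_chunks=3,
)
# With a 1500-char budget only the two highest-ranked chunks survive.
```

Lowering both values keeps prompts inside an 8K local model's window; raising them lets a 200K-token cloud API see far more retrieved context per question.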

Technology stack

| Layer | Technology |
| --- | --- |
| Frontend | Flutter Desktop (Linux, Windows, macOS) |
| Backend | Python 3.12 · FastAPI · LangChain |
| AI engine | Ollama (local) · Groq Cloud · Google Gemini |
| Vector store | ChromaDB |
| Infrastructure | Docker Compose |

Next steps

Quickstart

Get SoftArchitect AI running locally in under 5 minutes.

Architecture

Understand the Clean Architecture and Hexagonal design of the codebase.
