Ruflo is the execution layer that wraps Claude Code and Codex with everything a working agent needs: tools, memory, coordination loops, sandboxes, and security guardrails. Instead of one model working in isolation, Ruflo wires together a full system in which agents self-organize into swarms, learn from every completed task, and remember successful patterns across sessions. The overall data path follows a single closed loop:
User --> Ruflo (CLI/MCP) --> Router --> Swarm --> Agents --> Memory --> LLM Providers
                          ^                           |
                          +---- Learning Loop <-------+
Each completed task feeds back through the Learning Loop — successful patterns are distilled into memory, which improves routing decisions for the next task. You write code normally; Ruflo manages the rest.

Layer-by-Layer Breakdown

Entry Layer

The first stop for every request. Ruflo exposes two surfaces:
  • CLI: 26 commands and 140+ subcommands covering the full agent lifecycle, swarm management, memory operations, neural training, security scanning, and more.
  • MCP Server: 313 tools served over the Model Context Protocol. Register it once with `claude mcp add ruflo -- npx ruflo@latest mcp start`; the tools are then callable directly from Claude Code or any other MCP-compatible client (VS Code, Cursor, Windsurf, Claude Desktop).

AIDefence Security

Every inbound request passes through AIDefence before routing. This layer provides:
  • Request validation — Zod-based schema checks on all inputs
  • Prompt injection blocking — detects and neutralises injection attempts at the boundary
  • PII detection — 14-type pipeline strips sensitive data before it can propagate to agents or leave the node
Threat detection runs in under 10 ms and classifies requests as Safe → Allow, Warning → Sanitise, or Threat → Block.
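
The three-way classification above can be sketched as a small triage function. This is only an illustration of the Safe / Warning / Threat split, not AIDefence's implementation: the `Verdict` enum, the `triage` function, and both regular expressions are hypothetical stand-ins for the real detectors.

```python
import re
from enum import Enum


class Verdict(Enum):
    SAFE = "allow"
    WARNING = "sanitise"
    THREAT = "block"


# Hypothetical detectors; a real pipeline would use far richer checks.
INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def triage(text: str) -> tuple[Verdict, str]:
    """Return a verdict and the text that may continue to routing."""
    if INJECTION.search(text):
        # Blocked: nothing is forwarded.
        return Verdict.THREAT, ""
    if EMAIL.search(text):
        # PII found: forward a redacted copy instead of the original.
        return Verdict.WARNING, EMAIL.sub("[REDACTED_EMAIL]", text)
    return Verdict.SAFE, text
```

In this sketch a blocked request forwards nothing, a warning forwards a redacted copy, and a safe request passes through unchanged.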

Routing Layer

After security clearance, the routing layer decides what runs and where:

| Component | Role |
| --- | --- |
| Q-Learning Router | Learns from task outcomes; epsilon-greedy exploration; 89% routing accuracy |
| MoE (Mixture of Experts) | 8 specialised expert networks; dynamic gating selects the best expert per task type |
| Skills | 137+ pre-built skills covering V3 core, swarm, GitHub, SPARC, FlowNexus, and dual-mode workflows |
| Hooks | 27 lifecycle hooks fire automatically at task boundaries, session events, and tool calls |

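
Epsilon-greedy routing that learns from task outcomes can be sketched in a few lines. This is a minimal tabular sketch, not Ruflo's router: the class name, the agent names, and the `epsilon` and `alpha` defaults are assumptions, and the update is the one-step (bandit-style) form of Q-learning with no next-state term.

```python
import random
from collections import defaultdict


class EpsilonGreedyRouter:
    def __init__(self, agents, epsilon=0.1, alpha=0.5, seed=None):
        self.agents = list(agents)
        self.epsilon = epsilon        # probability of exploring (assumed value)
        self.alpha = alpha            # learning rate (assumed value)
        self.q = defaultdict(float)   # (task_type, agent) -> estimated value
        self.rng = random.Random(seed)

    def route(self, task_type):
        if self.rng.random() < self.epsilon:
            # Explore: try a random agent.
            return self.rng.choice(self.agents)
        # Exploit: pick the agent with the highest estimate for this task type.
        return max(self.agents, key=lambda a: self.q[(task_type, a)])

    def record(self, task_type, agent, reward):
        key = (task_type, agent)
        # Move the estimate toward the observed reward.
        self.q[key] += self.alpha * (reward - self.q[key])
```

Calling `record` after each task and `route` before the next one is what lets the estimates improve over time.
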
The routing layer also runs a Thompson sampling model router (alpha.5+): a cost-adjusted multi-armed bandit that self-corrects across three tiers (Haiku / Sonnet / Opus) using Beta(α, β) priors updated by hooks_model-outcome. After roughly 50 outcomes it stops over-using expensive tiers — no manual threshold tuning needed.
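
A cost-adjusted Thompson sampler over three tiers might look like the sketch below. It is an assumption-laden illustration, not the shipped router: the class and method names are hypothetical, the relative costs are made up, and dividing the sampled success rate by cost is just one plausible way to make the bandit cost-aware.

```python
import random


class ThompsonTierRouter:
    def __init__(self, costs, seed=None):
        self.costs = dict(costs)                    # tier -> relative cost (assumed)
        self.alpha = {t: 1.0 for t in self.costs}   # 1 + observed successes
        self.beta = {t: 1.0 for t in self.costs}    # 1 + observed failures
        self.rng = random.Random(seed)

    def choose(self):
        def score(tier):
            # Draw a plausible success rate from the tier's Beta distribution,
            # then penalise expensive tiers.
            sample = self.rng.betavariate(self.alpha[tier], self.beta[tier])
            return sample / self.costs[tier]
        return max(self.costs, key=score)

    def record(self, tier, success):
        # Each outcome updates the Beta parameters for the tier that ran.
        if success:
            self.alpha[tier] += 1.0
        else:
            self.beta[tier] += 1.0

    def mean(self, tier):
        # Posterior mean success rate for a tier.
        return self.alpha[tier] / (self.alpha[tier] + self.beta[tier])
```

As outcomes accumulate, the Beta distributions narrow, so a cheap tier that keeps succeeding wins more of the draws without any hand-set threshold.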

Swarm Coordination

Complex tasks are broken apart by the swarm coordinator and distributed across specialised agents:

| Component | Description |
| --- | --- |
| Topologies | hierarchical, mesh, ring, star; chosen based on task complexity |
| Consensus | Raft (leader-elected, strongly consistent), Byzantine/BFT (tolerates fewer than ⅓ faulty agents), Gossip (eventually consistent, high-throughput) |
| Claims | Human-agent work ownership protocol with claim, release, and handoff semantics |

The hierarchical topology with Raft consensus is the recommended default for coding tasks because a single coordinator validates every output against the original goal, catching drift early.
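
The fault-tolerance limits of the two strongly consistent options follow from simple arithmetic. The helpers below are a sketch using the textbook bounds (a Raft majority, and n ≥ 3f + 1 for classic BFT); the function names are hypothetical and say nothing about how Ruflo sizes its swarms.

```python
def raft_quorum(n: int) -> int:
    """Votes needed for a majority in a Raft cluster of n nodes."""
    return n // 2 + 1


def bft_max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1 (the classic BFT bound)."""
    return (n - 1) // 3
```

With 15 agents, a Raft leader needs 8 votes, while a BFT swarm of the same size tolerates at most 4 faulty agents.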

Agent Layer

The swarm spawns from a pool of 100+ typed, specialised agents. Each agent is optimised for a specific role: coder, tester, reviewer, architect, security, docs, devops, researcher, analyzer, coordinator, queen-coordinator, security-architect, memory-specialist, perf-analyzer, pr-manager, and many more across eight categories. Agents are managed by the AgentPool, which handles auto-scaling, idle timeouts, and health monitoring. Most users never spawn agents manually — the swarm coordinator does it automatically based on task type.
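
Idle timeouts are the simplest of the pool duties to illustrate. The sketch below is hypothetical and is not the AgentPool API: `IdlePool`, `touch`, and `reap` are invented names, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class IdlePool:
    def __init__(self, idle_timeout: float):
        self.idle_timeout = idle_timeout
        self.last_used = {}   # agent id -> timestamp of last activity

    def touch(self, agent_id: str, now: float) -> None:
        """Record that an agent did some work at time `now`."""
        self.last_used[agent_id] = now

    def reap(self, now: float) -> list[str]:
        """Remove and return agents idle for longer than the timeout."""
        expired = [a for a, t in self.last_used.items()
                   if now - t > self.idle_timeout]
        for agent_id in expired:
            del self.last_used[agent_id]
        return expired
```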

Resources

Three resource types back the agent layer:
  • Memory (AgentDB): HNSW-indexed vector database; 150x–12,500x faster than brute-force search at scale. Entries persist across sessions and feed the Learning Loop.
  • LLM Providers: Anthropic (Claude), OpenAI (GPT), Google (Gemini), Cohere, and Ollama. Smart routing picks the cheapest provider that meets quality requirements and fails over automatically if a provider is unavailable.
  • 12 Background Workers: ultralearn, audit, optimize, consolidate, map, deepdive, document, refactor, benchmark, testgaps, predict, and preload. They trigger automatically on context signals (file changes, session events, memory thresholds) or can be dispatched manually.
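
The provider rule (cheapest option that meets the quality bar, with failover) can be sketched as a filter followed by a minimum. Everything here is illustrative: the `Provider` fields, the cost and quality scales, and the placeholder provider names are assumptions, not Ruflo's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    cost: float      # relative cost per request (made-up units)
    quality: float   # score in [0, 1] (made-up scale)
    available: bool = True


def pick_provider(providers, min_quality):
    """Cheapest available provider meeting the quality floor, else None."""
    candidates = [p for p in providers
                  if p.available and p.quality >= min_quality]
    return min(candidates, key=lambda p: p.cost, default=None)
```

Marking a provider as unavailable removes it from the candidate list, so the next-cheapest qualifying provider is picked instead; that is all failover amounts to in this sketch.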

RuVector Intelligence

The intelligence substrate that powers learning across the entire system:

| Component | Purpose | Performance |
| --- | --- | --- |
| SONA | Self-Optimizing Pattern Learning: learns optimal routing from trajectories | <0.05 ms adaptation |
| EWC++ | Elastic Weight Consolidation: prevents catastrophic forgetting when learning new tasks | Zero knowledge loss |
| Flash Attention | Optimised attention computation via @ruvector/attention | 2.49x–7.47x speedup |
| HNSW | Hierarchical Navigable Small World vector search | Sub-millisecond retrieval |
| ReasoningBank | Pattern storage with RETRIEVE → JUDGE → DISTILL → CONSOLIDATE → ROUTE cycle | BM25 + semantic hybrid search |
| Hyperbolic Embeddings | Poincaré ball model for hierarchical code relationships | Exponential embedding capacity |
| LoRA / MicroLoRA | Low-Rank Adaptation for efficient on-device fine-tuning | <5 MB memory footprint (Micro) |
| Int8 Quantisation | Converts 32-bit weights to 8-bit | ~4× memory reduction |
| 9 RL Algorithms | PPO, A2C, DQN, Q-Learning, SARSA, Decision Transformer, Curiosity, and more | Task-specific learning |
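
The roughly 4× figure for Int8 quantisation comes from storing one byte per weight instead of four. The sketch below uses symmetric per-tensor quantisation with a single scale factor; that scheme and the function names are assumptions for illustration, since the scheme RuVector actually uses is not described here.

```python
from array import array


def quantise(weights):
    """Symmetric per-tensor quantisation: map floats to int8 with one scale."""
    # Fall back to a scale of 1.0 when every weight is zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = array("b", (round(w / scale) for w in weights))  # "b" = signed 8-bit
    return q, scale


def dequantise(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]
```

Each recovered weight differs from the original by at most half a quantisation step (`scale / 2`), which is the precision traded for the smaller footprint.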

V3 Architecture Decision Records

The V3 rewrite is governed by ten ADRs that codify every major design choice:

| ADR | Decision |
| --- | --- |
| ADR-001 | Adopt agentic-flow as the core foundation (eliminates 10,000+ duplicate lines) |
| ADR-002 | Domain-Driven Design structure with bounded contexts |
| ADR-003 | Single coordination engine: UnifiedSwarmCoordinator |
| ADR-004 | Plugin-based architecture (microkernel pattern) |
| ADR-005 | MCP-first API design across all modules |
| ADR-006 | Unified memory service backed by AgentDB |
| ADR-007 | Event sourcing for full audit trail on state changes |
| ADR-008 | Vitest over Jest (10× faster test runs) |
| ADR-009 | Hybrid memory backend (SQLite + AgentDB) as the default |
| ADR-010 | Node.js 20+ only; Deno support removed |

Performance Reference

| Metric | Target | Achieved |
| --- | --- | --- |
| Event Bus (100k events) | <50 ms | ~6 ms |
| Map Lookup (100k gets) | <20 ms | ~16 ms |
| Flash Attention speedup | 2.49x–7.47x | Validated |
| AgentDB HNSW search | 150x–12,500x faster | HNSW-indexed |
| SONA adaptation latency | <0.05 ms | ~0.02 ms |
| Agent coordination (15 agents) | <100 ms | Validated |

Explore the System

Agents

100+ typed agents, lifecycle states, spawning, and pool management.

Swarms

Topology types, consensus algorithms, and hive-mind coordination.

Memory

HNSW vector storage, semantic search, and cross-session persistence.

Hooks

27 lifecycle hooks and 12 background workers that power the learning loop.
