ruflo neural exposes Ruflo’s RuVector intelligence layer, the collection of algorithms that make the system self-improving. Training stores successful patterns in the ReasoningBank so that the SONA router can recall and reuse them on future tasks. The command also surfaces diagnostics for the HNSW vector index, the Flash Attention engine, the MicroLoRA adapter, and the Thompson sampling model router.

Core concepts

| Concept | What it is |
| --- | --- |
| SONA (Self-Optimizing Neural Architecture) | Real-time pattern matching and routing engine. Target: <0.05 ms per routing decision. Benchmark is measured during neural status. |
| ReasoningBank | Persistent store of successful reasoning trajectories. Entries are written by hooks on task completion and recalled by neural patterns. |
| EWC++ (Elastic Weight Consolidation++) | Prevents catastrophic forgetting: old patterns retain their confidence even as new ones are added. |
| MicroLoRA / LoRA | Low-rank adaptation of the routing weights. Updated incrementally during neural train without a full model retrain. |
| Flash Attention | Optimized attention computation. 2.49×–7.47× speedup on attention-heavy sequences (benchmarked). |
| MoE (Mixture of Experts) | 8-expert router that activates the top-2 specialists for each input token, reducing unnecessary computation. |
| Thompson sampling | Cost-adjusted multi-armed bandit used to route tasks to the cheapest model tier (Haiku / Sonnet / Opus) that can handle them. Beta(α,β) priors are updated by hooks model-outcome. |
| 9 RL algorithms | Q-Learning, SARSA, A2C, PPO, DQN, Decision Transformer, and three others, available for task-specific fine-tuning. |
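
As a rough illustration of the Thompson sampling row above, the following Python sketch keeps a Beta(α, β) posterior per model tier, draws one sample from each, subtracts a cost penalty, and routes to the highest score. The cost values, the cost_weight parameter, and the ThompsonRouter class are hypothetical and are not taken from Ruflo; the real router may weigh cost quite differently.

```python
import random

# Hypothetical relative costs per call; this page does not give real prices.
TIER_COST = {"haiku": 1.0, "sonnet": 4.0, "opus": 20.0}


class ThompsonRouter:
    """Cost-adjusted Thompson sampling over model tiers (illustrative only)."""

    def __init__(self, tiers, cost_weight=0.02, seed=None):
        self.rng = random.Random(seed)
        self.cost_weight = cost_weight
        # Beta(1, 1) is a uniform prior over each tier's success probability.
        self.params = {tier: [1.0, 1.0] for tier in tiers}

    def choose(self):
        """Sample a success rate per tier and pick the best cost-adjusted score."""
        best_tier, best_score = None, float("-inf")
        for tier, (alpha, beta) in self.params.items():
            sampled_success = self.rng.betavariate(alpha, beta)
            score = sampled_success - self.cost_weight * TIER_COST[tier]
            if score > best_score:
                best_tier, best_score = tier, score
        return best_tier

    def record_outcome(self, tier, success):
        """Update the posterior: a success increments alpha, a failure increments beta."""
        self.params[tier][0 if success else 1] += 1.0


router = ThompsonRouter(["haiku", "sonnet", "opus"], seed=0)
tier = router.choose()
router.record_outcome(tier, success=True)
```

Because each decision samples from the posterior instead of taking its mean, tiers with little evidence still get tried occasionally, while the cost penalty biases ties toward the cheaper tier.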

Synopsis

ruflo neural <subcommand> [options]

Subcommands

| Subcommand | Description |
| --- | --- |
| train | Train patterns using WASM SIMD acceleration (MicroLoRA + Flash Attention) |
| status | Full system status: SONA, RuVector, HNSW, embedding model, LoRA adapter |
| patterns | Query, list, export, or import the pattern store |
| predict | Run a prediction through the current pattern model |
| optimize | Trigger a manual optimization pass on the routing weights |
| benchmark | Run a latency and throughput benchmark of the neural stack |
| list | List available models and adapters |
| export | Export the trained pattern set to JSON |
| import | Import a pattern set from JSON |
| router | Inspect and configure the MoE / Thompson sampling router |
| distill | weight-eft training-data slice: export, plan, eval, and (spend-gated) remote train |

train

Train the neural routing weights on a named pattern type. Generates embeddings for the training data, runs the MicroLoRA update loop with optional Flash Attention and contrastive (InfoNCE) loss, and persists trajectories to the ReasoningBank.
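
To make the contrastive term concrete, here is a minimal Python sketch of an InfoNCE loss for a single anchor with one positive and several negatives. It is not Ruflo's training code: the function names, the temperature value, and the use of cosine similarity are assumptions chosen for illustration, and the vectors are assumed to be non-zero.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log-softmax probability of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    peak = max(logits)  # subtract the max before exponentiating for stability
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_denominator - logits[0]


# A positive aligned with the anchor yields a much lower loss than a misaligned one.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [0.0, -1.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Minimizing this quantity pulls the anchor toward its positive and pushes it away from the negatives, which is the behaviour the --contrastive option describes.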
ruflo neural train [options]

--pattern / -p (string, default: "coordination")
Pattern type to train. Valid values: coordination, optimization, prediction, security, testing, debugging, memory, reasoning.

--epochs / -e (number, default: 50)
Number of training epochs.

--data / -d (string)
Path to a JSON training-data file (array of {content, type} objects). When omitted, Ruflo generates synthetic samples for the selected pattern type.

--learning-rate / -l (number, default: 0.01)
Gradient step size for MicroLoRA updates.

--batch-size / -b (number, default: 32)
Number of embeddings processed per gradient step.

--dim (number, default: 256)
Embedding dimension (capped at 256).

--wasm / -w (boolean, default: true)
Use the RuVector WASM backend. Falls back to a JS implementation if @ruvector/learning-wasm is not installed.

--flash (boolean, default: true)
Enable Flash Attention for 2.49×–7.47× speedup on attention computations.

--moe (boolean, default: false)
Enable Mixture of Experts routing during training.

--hyperbolic (boolean, default: false)
Use Poincaré ball (hyperbolic) attention for hierarchical pattern structures.

--contrastive (boolean, default: true)
Train with InfoNCE contrastive loss (anchor + positives vs. negatives).

--curriculum (boolean, default: false)
Enable curriculum learning: start with easy samples and scale difficulty by epoch.

--backend (string, default: "auto")
Training backend: auto (native @ruvector/ruvllm when available, else WASM), native (requires @ruvector/ruvllm), or wasm.

--val-split (number, default: 0.1)
Fraction of data held out for validation (0–1). Enables early stopping and Best Val Loss reporting. Native backend only.

--resume (string)
Path to a native-backend checkpoint file to resume training from. Requires --backend native (or auto resolving to native). Restores epoch position on @ruvector/ruvllm ≥ 2.6.0.

--resume is a native-backend-only feature. Passing it alongside --backend wasm is an explicit error: the command exits immediately with a descriptive message.
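
The --data option expects a JSON array of {content, type} objects. The Python sketch below shows one way to build and sanity-check such a file before passing it to ruflo neural train -d. The helper names are hypothetical, and using pattern names such as "security" as type values is an assumption; this page does not list the accepted type values.

```python
import json
from pathlib import Path


def validate_samples(samples):
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(samples, list):
        return ["top-level value must be a JSON array"]
    problems = []
    for index, sample in enumerate(samples):
        if not isinstance(sample, dict):
            problems.append(f"item {index}: expected an object")
            continue
        for key in ("content", "type"):
            value = sample.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty '{key}'")
    return problems


def write_training_file(path, samples):
    """Validate the samples, then write them as a pretty-printed JSON array."""
    problems = validate_samples(samples)
    if problems:
        raise ValueError("; ".join(problems))
    Path(path).write_text(json.dumps(samples, indent=2), encoding="utf-8")


# Example samples; the 'type' values here are assumed, not documented.
samples = [
    {"content": "Reject requests whose JWT signature fails verification", "type": "security"},
    {"content": "Add a regression test for the pagination off-by-one", "type": "testing"},
]
```

Calling write_training_file("./my-training-data.json", samples) would then produce a file with the shape described for --data.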

status

Measure and display the live state of every component in the neural stack. Runs a 100-sample benchmark of the SONA adaptation time to give a real latency number rather than a static claim.
ruflo neural status [options]

--verbose / -v (boolean, default: false)
Show extended metrics: trajectory counts, LoRA delta norms, SONA per-operation p99, and ruvllm coordinator stats.

The status table covers:

| Component | What is reported |
| --- | --- |
| SONA Coordinator | Active/inactive; avg adaptation time in µs |
| RuVector Training | Backend (WASM or JS fallback); total MicroLoRA adaptations |
| SONA Engine | Total learns, total searches |
| ReasoningBank | Entry count, patterns stored |
| HNSW Index | Initialized / available / not installed; vector count and dimensions |
| Embedding Model | Provider name and dimensions; semantic vs hash-fallback flag |
| Flash Attention Ops | Available operations: batchCosineSim, softmax, topK |
| Int8 Quantization | ~4× memory reduction |
| ruvllm Coordinator | Trajectory count |
| Contrastive Trainer | Triplet count, agent count |
| Training Pipeline | Backend version; latest checkpoint path and age |
| Graph Database | Node and edge counts |
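
The ~4× figure in the Int8 Quantization row is what one would expect from storing one byte per dimension instead of a 4-byte float32. The Python sketch below demonstrates that arithmetic with a simple symmetric per-vector scheme. It assumes vectors are otherwise held as float32 and is not a description of RuVector's actual quantizer; the function names are illustrative.

```python
from array import array


def quantize_int8(vector):
    """Symmetric per-vector int8 quantization: returns (codes, scale)."""
    peak = max(abs(x) for x in vector)
    scale = peak / 127.0 if peak > 0 else 1.0
    codes = array("b", (max(-127, min(127, round(x / scale))) for x in vector))
    return codes, scale


def dequantize(codes, scale):
    """Approximate reconstruction of the original values."""
    return [code * scale for code in codes]


vector = [0.5, -1.0, 0.25, 0.0] * 64           # 256 dimensions
as_float32 = array("f", vector)                # 4 bytes per value
codes, scale = quantize_int8(vector)           # 1 byte per value, plus one scale
ratio = (as_float32.itemsize * len(as_float32)) / (codes.itemsize * len(codes))
restored = dequantize(codes, scale)
```

The payload shrinks by exactly 4×; the extra scale factor stored per vector is why the overall saving is approximate, and each restored value differs from the original by at most half a quantization step.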

patterns

Query or manage the pattern store (ReasoningBank entries).
ruflo neural patterns [options]

--action / -a (string, default: "list")
Operation to perform: list, analyze, learn, or predict.

--query / -q (string)
Search query for pattern retrieval (used with analyze action).

--limit / -l (number, default: 10)
Maximum number of patterns to return.

predict

Run input text through the trained pattern model and return a routing prediction with confidence scores.
ruflo neural predict --input "<text>" [options]

--input / -i (string, required)
Input text to predict routing for.

--k / -k (number, default: 5)
Number of top predictions to return.
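
This page does not say how predict derives its confidence scores. As a purely illustrative sketch, the Python below ranks stored pattern embeddings against a query with cosine similarity, turns the similarities into confidences with a softmax, and keeps the top k. The function names, the pattern labels, and the scoring scheme itself are assumptions, not Ruflo's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(query, patterns, k=5):
    """Return up to k (label, confidence) pairs, highest confidence first.

    `patterns` maps a label to its embedding; both are hypothetical inputs.
    """
    labels = list(patterns)
    similarities = [cosine(query, patterns[label]) for label in labels]
    confidences = softmax(similarities)
    ranked = sorted(zip(labels, confidences), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


patterns = {"coordination": [1.0, 0.0], "testing": [0.0, 1.0], "security": [-1.0, 0.0]}
predictions = top_k_predictions([1.0, 0.0], patterns, k=2)
```

Here the confidences sum to 1 across all stored patterns, so truncating to k entries leaves a total below 1 whenever more than k patterns exist.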

benchmark

Run a latency and throughput benchmark of the entire neural stack and produce a table of operations-per-second for each component.
ruflo neural benchmark [--iterations <n>] [--dim <d>] [--keys <k>]

--iterations / -i (number, default: 1000)
Number of benchmark iterations per component.

--dim / -d (number, default: 256)
Embedding dimension to benchmark (capped at 256).

--keys / -k (number, default: 100)
Number of attention keys for attention benchmarks.
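
To show how the three options relate to an operations-per-second figure, here is a minimal Python timing harness. It is a generic sketch, not the benchmark Ruflo runs: the benchmark helper is hypothetical, and a plain-Python cosine similarity stands in for the real attention kernels.

```python
import math
import random
import time


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def benchmark(fn, iterations=1000):
    """Call fn `iterations` times; return (ops_per_second, mean_latency_ms)."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = max(time.perf_counter() - start, 1e-12)  # guard against zero
    return iterations / elapsed, (elapsed / iterations) * 1000.0


rng = random.Random(0)
dim, keys = 256, 100                      # mirrors the --dim and --keys defaults
query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
key_matrix = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(keys)]

# One "operation" here is scoring the query against every key.
ops_per_second, latency_ms = benchmark(
    lambda: [cosine(query, key) for key in key_matrix], iterations=20
)
print(f"{ops_per_second:.1f} ops/s, {latency_ms:.3f} ms per op")
```

Raising the iteration count reduces timing noise, while larger dimension and key counts make each operation more expensive.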

distill

The distill subgroup manages the weight-eft training-data pipeline (ADR-150). Every operation costs $0 except distill train --execute --yes, which triggers real remote-GPU compute on a user-provided host.

ruflo neural distill <export|plan|eval|train>

| Subcommand | Description |
| --- | --- |
| distill export | Convert captured session transcripts to audited SFT/DPO JSONL + a guard report |
| distill plan | Print the GPU training plan and ruvllm commands (offline dry-run, $0) |
| distill eval | Compute the cost-Pareto delta between a base-model run and an adapter run |
| distill train | Remote-GPU LoRA tune via SSH (dry-run by default; --execute --yes to spend) |

distill train --execute --yes initiates real GPU compute on the remote host you specify. The default dry run is fully offline and contacts nothing. The --preflight flag adds read-only SSH probes without training.
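
The spend gate described above can be modelled as a small decision function. The Python sketch below is a hypothetical model of that behaviour, not Ruflo's code. It assumes that --execute without --yes stays in a non-spending mode and that --execute --yes takes precedence over --preflight; neither detail is confirmed by this page.

```python
def resolve_mode(execute=False, yes=False, preflight=False):
    """Map the three flags to a run mode: 'execute', 'preflight', or 'dry-run'."""
    if execute and yes:
        return "execute"    # the only path that starts remote-GPU compute
    if preflight:
        return "preflight"  # read-only SSH probes, no training
    return "dry-run"        # fully offline, contacts nothing


# Assumed behaviour: a lone --execute does not spend without the --yes confirmation.
mode_without_confirmation = resolve_mode(execute=True)
```

Requiring two independent flags before any spend means a single mistyped or copied option cannot start a paid run on its own.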

Examples

# Train coordination patterns with default settings (50 epochs, Flash Attention)
ruflo neural train -p coordination

# Security patterns with contrastive learning (100 epochs)
ruflo neural train -p security -e 100 --contrastive

# Train from a custom data file
ruflo neural train -d ./my-training-data.json --flash

# MoE + hyperbolic attention for hierarchical code patterns
ruflo neural train -p coordination --moe --hyperbolic -e 200

# Native backend with validation split and early stopping
ruflo neural train -p optimization --backend native --val-split 0.2 -e 500

# Resume from a previous checkpoint
ruflo neural train -p testing --backend native \
  --resume .claude-flow/neural/lora-checkpoint-1719000000000.json

After ruflo init, run ruflo hooks pretrain before ruflo neural train. Pretraining builds the initial ReasoningBank from your existing codebase; neural train then uses those entries as a warm starting distribution, producing better routing accuracy with fewer epochs.

The SONA <0.05 ms target is measured on the adaptation benchmark (benchmarkAdaptation(100)). If neural status reports the target is not met (Target Met: No), try reducing CLAUDE_FLOW_HNSW_EF or lowering the embedding dimension with --dim 128 on the next training run.
