Swarm vs. Pipeline: How Pentest Swarm AI Really Works

Pentest Swarm AI is not a pipeline with a fancier name. Most “multi-agent” pentesting tools route work through a fixed sequence — recon feeds classify, classify feeds exploit, exploit feeds report — with a central planner dispatching each step. Pentest Swarm AI replaces that design with a genuine swarm: agents share an environment, each agent’s writes influence every other agent’s behaviour, and the useful attack paths emerge from that shared state rather than from any script that prescribes them.

The Pipeline Problem

A fixed recon → classify → exploit → report pipeline has a structural ceiling. Every path the campaign can take must be anticipated in advance and encoded in the orchestrator. Parallelism is limited to what the author of the pipeline planned for. If recon surfaces an unexpected technology stack, the pipeline can only follow routes already wired in. Partial results from one phase cannot begin feeding the next until the full phase completes.

More practically: pipelines are brittle. Adding a new capability means editing the orchestrator. Removing an agent risks breaking the sequencing logic. The whole thing is only as smart as whoever wrote the dispatch code.

Swarm intelligence addresses these limits at the architecture level. Because coordination happens through shared state rather than function calls, agents are genuinely independent. New agents can join the swarm by declaring a trigger predicate, with no changes to any existing code.

Three Swarm Primitives

Stigmergy

In biology, stigmergy is coordination through environmental modification — ants lay pheromone trails that other ants follow. Pentest Swarm AI uses the same mechanism. Agents do not talk to each other. They write findings to a shared blackboard, and every finding carries a pheromone weight that biases other agents toward it. When the classifier writes a high-severity CVE_MATCH, its pheromone weight signals the exploit agent that this finding is worth acting on immediately. When a SESSION token ages out, its decayed weight means agents naturally stop spending resources on it. Coordination is an emergent property of writes and reads, not a property of the orchestrator.
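The write-then-read coordination described above can be illustrated with a minimal sketch. It is not the project's blackboard: the names `Finding`, `Blackboard`, `Write`, and `Hottest` are hypothetical, and the in-memory slice stands in for whatever storage the real board uses.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Finding is a hypothetical blackboard entry; the field names are
// illustrative and not taken from the project's source.
type Finding struct {
	Type      string  // e.g. "CVE_MATCH", "PORT_OPEN"
	Target    string  // host or endpoint the finding refers to
	Pheromone float64 // higher weight = more attractive to other agents
}

// Blackboard is the only channel between agents: they write findings
// and read them back, and never call each other directly.
type Blackboard struct {
	mu       sync.RWMutex
	findings []Finding
}

// Write publishes a finding for every other agent to see.
func (b *Blackboard) Write(f Finding) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.findings = append(b.findings, f)
}

// Hottest returns findings of one type whose weight is above minWeight,
// strongest pheromone first.
func (b *Blackboard) Hottest(typ string, minWeight float64) []Finding {
	b.mu.RLock()
	defer b.mu.RUnlock()
	var out []Finding
	for _, f := range b.findings {
		if f.Type == typ && f.Pheromone > minWeight {
			out = append(out, f)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Pheromone > out[j].Pheromone })
	return out
}

func main() {
	bb := &Blackboard{}
	// A classifier-like agent writes three matches with different weights.
	bb.Write(Finding{"CVE_MATCH", "api.example.com", 0.9})
	bb.Write(Finding{"CVE_MATCH", "old.example.com", 0.3})
	bb.Write(Finding{"CVE_MATCH", "www.example.com", 0.6})

	// An exploit-like agent only looks at what is hot enough to act on.
	for _, f := range bb.Hottest("CVE_MATCH", 0.5) {
		fmt.Printf("%s %.1f\n", f.Target, f.Pheromone)
	}
}
```

The reader never learns who wrote a finding. It only sees the type and the weight, which is what keeps the two sides decoupled.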

Emergence

Attack chains appear that no single agent planned. A recon finding wakes the classifier. A high-severity classification wakes the exploit agent. Exploit results land back on the board and wake the report agent. The sequence is not prescribed — it follows from the pheromone state of the blackboard at any given moment. In practice this means a 1,000-subdomain target can be processed without anyone writing a plan for it. The swarm self-organises around what the blackboard contains.

Decentralization

Each agent is defined by its trigger predicate — a set of conditions on the blackboard that, when satisfied, cause the agent to be dispatched against a matching finding. There is no central planner that routes work. The scheduler is a thin coordinator: it enforces concurrency caps, enforces scope, and listens for shutdown signals. Selection is emergent from trigger rules and pheromone weights. Adding a custom agent is a matter of implementing the Agent interface with a Trigger() predicate and a Handle() function. No orchestrator code changes.
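As a rough illustration of trigger-driven dispatch, the sketch below defines an `Agent` interface with the `Trigger()` and `Handle()` methods named above. The method signatures, the `Name()` method, the `run` loop, and the two toy agents are assumptions made for this example; the 0.2 and 0.5 thresholds are borrowed from the architecture diagram.

```go
package main

import "fmt"

// Finding is a hypothetical blackboard entry used only in this sketch.
type Finding struct {
	Type      string
	Target    string
	Pheromone float64
}

// Agent carries the two methods named in the text. The signatures and
// the extra Name method are assumptions, not the project's interface.
type Agent interface {
	Name() string
	Trigger(f Finding) bool     // does this finding wake the agent?
	Handle(f Finding) []Finding // new findings to write back to the board
}

type classifier struct{}

func (classifier) Name() string { return "classify" }
func (classifier) Trigger(f Finding) bool {
	return f.Type == "PORT_OPEN" && f.Pheromone > 0.2
}
func (classifier) Handle(f Finding) []Finding {
	return []Finding{{Type: "CVE_MATCH", Target: f.Target, Pheromone: 0.8}}
}

type exploiter struct{}

func (exploiter) Name() string { return "exploit" }
func (exploiter) Trigger(f Finding) bool {
	return f.Type == "CVE_MATCH" && f.Pheromone > 0.5
}
func (exploiter) Handle(f Finding) []Finding {
	return []Finding{{Type: "EXPLOIT_RESULT", Target: f.Target, Pheromone: 0.7}}
}

// run offers every finding to every agent. No ordering between agents is
// encoded here; the chain falls out of which triggers match.
func run(agents []Agent, seed Finding) []string {
	queue := []Finding{seed}
	var trace []string
	for len(queue) > 0 {
		f := queue[0]
		queue = queue[1:]
		for _, a := range agents {
			if a.Trigger(f) {
				trace = append(trace, a.Name()+":"+f.Type)
				queue = append(queue, a.Handle(f)...)
			}
		}
	}
	return trace
}

func main() {
	// Registration order is deliberately "backwards" to show it is irrelevant.
	agents := []Agent{exploiter{}, classifier{}}
	fmt.Println(run(agents, Finding{"PORT_OPEN", "example.com:443", 1.0}))
	// prints: [classify:PORT_OPEN exploit:CVE_MATCH]
}
```

Adding a third agent to the slice requires no change to `run`, which is the property the section describes.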

Architecture Diagram

                     YOU
                      |
               pentestswarm scan example.com --swarm
                      |
           ┌──────────▼──────────┐
           │   SEED: TARGET_REG  │
           └──────────┬──────────┘
                      ▼
 ┌────────────────────────────────────────────────────────┐
 │              SHARED BLACKBOARD (pgvector)              │
 │                                                        │
 │   SUBDOMAIN · PORT_OPEN · HTTP_ENDPOINT · TECHNOLOGY   │
 │   CVE_MATCH · MISCONFIGURATION · EXPLOIT_CHAIN         │
 │   EXPLOIT_RESULT · CAMPAIGN_COMPLETE                   │
 │                                                        │
 │   (each finding has a pheromone weight that decays)    │
 └──┬─────────────┬─────────────┬─────────────┬───────────┘
    │             │             │             │
    │ triggers:   │ triggers:   │ triggers:   │ triggers:
    │ TARGET_REG  │ raw recon + │ CVE_MATCH   │ CAMPAIGN_
    │             │ pheromone>  │ pheromone>  │ COMPLETE
    │             │ 0.2         │ 0.5         │
    ▼             ▼             ▼             ▼
┌─────────┐  ┌─────────┐   ┌─────────┐   ┌─────────┐
│  RECON  │  │CLASSIFY │   │ EXPLOIT │   │ REPORT  │
│         │  │         │   │         │   │         │
│ runs 8  │  │ maps    │   │ builds  │   │ queries │
│ tools,  │  │ CVEs,   │   │ attack  │   │ board   │
│ writes  │  │ scores  │   │ chains  │   │ →md/    │
│ per     │  │ CVSS,   │   │ per     │   │ html/   │
│ finding │  │ writes  │   │ finding │   │ json/   │
└─────────┘  └─────────┘   └─────────┘   │ sarif   │
                                          └─────────┘

Two Execution Modes

Sequential Runner (default)

The default execution mode runs a deterministic 5-phase pipeline: seed → recon → classify → exploit → report. Each phase completes before the next begins.

pentestswarm scan example.com --scope example.com

This mode is marked stable in the feature table and is the battle-tested path for straightforward engagements where predictable sequencing is preferred over emergent behaviour.

The runner lives in internal/engine/runner.go. Cleanup is always registered before execution begins, so SIGINT, crashes, and budget exhaustion all trigger reverse-order cleanup.

Stigmergic Swarm (--swarm)

The swarm execution mode replaces the phase loop with a blackboard-driven scheduler. Agents run in parallel goroutines, each subscribed to its own trigger predicate. The scheduler in internal/engine/swarm_runner.go seeds the blackboard and then steps back; agents self-organise from there.

pentestswarm scan example.com --scope example.com --swarm

The swarm terminates when either:

A CAMPAIGN_COMPLETE finding is written to the blackboard (by the budget enforcer or by time expiry), or
The campaign context is cancelled (SIGINT, deadline).

The default swarm time budget is 20 minutes. Budget exhaustion triggers a CAMPAIGN_COMPLETE write so the report agent fires on partial state: you get a partial report rather than empty output.

This mode is currently marked alpha. The memory-backed blackboard is wired; the Postgres backend is beta.

Key Behaviours

Agents are independent

Any agent can be removed, replaced, or added without rewiring the others. Each agent declares its own trigger predicate. The scheduler subscribes each agent to its predicate and dispatches findings — no agent needs to know any other agent exists.

Pheromones decay per finding type

A PORT_OPEN stays hot for 1 hour. A TARGET_REGISTERED stays hot for 24 hours. A SESSION token decays in 15 minutes. Half-lives are config-driven via config/pheromones.yaml and can be overridden per deployment. Stale paths die naturally without any garbage collection logic.

Scope enforced at the tool layer and executor

The --scope flag is not bypassable. Every tool adapter routes through scope.ValidateAndLog() before spawning any subprocess. The executor performs a second validation pass. Violations emit a WARN subsystem=scope log event and return an error — they never silently continue.

Cleanup always registered before execution

Every exploit that creates artifacts — files, users, sessions — registers a cleanup entry before touching the target. Cleanup runs on normal exit, SIGINT, and scheduler crash. The cleanup context is detached from the run context so cancellation does not orphan cleanup jobs.

Prompt caching on Claude

System prompts are cached by default for the recon and classifier agents. Cache-hit metrics are emitted via Usage.CacheHitRate(). This cuts cost and latency on repeated prompts without any code change.
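One plausible way to compute a cache-hit metric is sketched below. The struct fields follow the token counters reported in the Anthropic Messages API usage object (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`), but the `Usage` type and the definition of the rate as cached reads over total input tokens are assumptions; the project's `Usage.CacheHitRate()` may be defined differently.

```go
package main

import "fmt"

// Usage is a hypothetical accumulator of per-request token counters.
type Usage struct {
	InputTokens              int // uncached input tokens
	CacheCreationInputTokens int // tokens written to the cache
	CacheReadInputTokens     int // tokens served from the cache
}

// Add folds another request's counters into the running total.
func (u Usage) Add(o Usage) Usage {
	return Usage{
		InputTokens:              u.InputTokens + o.InputTokens,
		CacheCreationInputTokens: u.CacheCreationInputTokens + o.CacheCreationInputTokens,
		CacheReadInputTokens:     u.CacheReadInputTokens + o.CacheReadInputTokens,
	}
}

// CacheHitRate returns the share of input tokens served from the cache,
// or 0 when no input tokens have been recorded.
func (u Usage) CacheHitRate() float64 {
	total := u.InputTokens + u.CacheCreationInputTokens + u.CacheReadInputTokens
	if total == 0 {
		return 0
	}
	return float64(u.CacheReadInputTokens) / float64(total)
}

func main() {
	// First call writes the system prompt to the cache; the second reads it.
	first := Usage{InputTokens: 100, CacheCreationInputTokens: 900}
	second := Usage{InputTokens: 100, CacheReadInputTokens: 900}
	total := first.Add(second)
	fmt.Printf("cache hit rate: %.2f\n", total.CacheHitRate())
}
```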

Comparison with the Ecosystem

Tool	Architecture	Executes vs. suggests	Memory	Tools wired	MCP	Swarm?
Pentest Swarm AI	Stigmergic blackboard	Executes	pgvector + pheromones	8 ProjectDiscovery + nmap; sqlmap / Burp MCP / Metasploit in roadmap	Yes	✅ real
PentestGPT	Single-agent ReAct	Suggests	None	None native	No	No
HackingBuddyGPT	Single-agent	Executes	Run logs	Shell passthrough	No	No
PentAGI	4 agents + planner	Executes	pgvector	40+ via MCP/shell	Partial	Pipeline
Shannon	White-box + browser	Executes	Session state	Browser DOM	No	Pipeline
HexStrike	MCP tool wrapper	Delegates to client LLM	None (stateless)	150+ via MCP	Yes	No
Pentest-R1	RL-tuned LLM	Executes	Trajectory	CTF-scope	No	No

Feature Status

Feature	Status	Notes
Sequential 5-phase runner	stable	Default mode; battle-tested core
Stigmergic swarm scheduler	alpha	`--swarm` flag; memory-backed blackboard wired
ProjectDiscovery toolchain	stable	subfinder, httpx, nuclei, naabu, katana, dnsx, gau
`nmap` adapter	stable	XML parsed; scope-validated
Cleanup registry	stable	Always runs on SIGINT / exit / budget-cancel
Claude prompt caching	stable	Enabled for recon + classifier by default
`--strict` LLM mode	stable	Promotes LLM errors to fatal
CVSS v3.1 scoring	stable	FIRST spec
Postgres blackboard backend	beta	Migration shipped; runner uses memory-board for now
MCP server	beta	`pentestswarm mcp serve`
VS Code extension	beta	`deploy/vscode/`
GitHub Action	beta	`deploy/github-action/action.yml` with SARIF
Swarm playbooks (5)	beta	`playbooks/{bug-bounty,external-asm,ci-cd,internal-network,ctf-solver}.yaml`
Live dashboard	alpha	`web/`; UI built, wiring to live campaigns in progress
Burp MCP bridge	planned	Wave 2
Metasploit / ZAP / sqlmap adapters	planned	Wave 2
Fine-tuned Pentest-Swarm model	planned	Wave 3 (Pentest-R1 recipe)
Cybench / AutoPenBench benchmarks	planned	Wave 3
