Pentest Swarm AI deploys five specialist agents, each responsible for a distinct phase of a penetration test. Rather than being called in sequence by a central planner, each agent declares a trigger predicate — a set of conditions on the shared blackboard — and is dispatched automatically when matching findings appear. Every agent that interacts with an LLM uses a ReAct (Reason + Act) loop: the model reasons about the current finding, decides which tool to invoke, observes the result, and iterates until it has enough information to write a conclusion back to the blackboard.
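The dispatch rule described above can be sketched as a small matching function. This is a minimal, self-contained illustration rather than the project's code: `Finding` and `Predicate` are simplified stand-ins for the real `blackboard` types, and the `Matches` method, the `Pheromone` field, and the inclusive comparison against `MinPheromone` are all assumptions.

```go
package main

import "fmt"

// FindingType, Finding and Predicate are simplified stand-ins for the
// real blackboard types; the field names are assumptions.
type FindingType string

type Finding struct {
	Type      FindingType
	Pheromone float64 // current (already decayed) weight
}

type Predicate struct {
	Types        []FindingType
	MinPheromone float64
}

// Matches (hypothetical) reports whether f satisfies the predicate:
// its type is listed and its weight is at or above MinPheromone.
func (p Predicate) Matches(f Finding) bool {
	if f.Pheromone < p.MinPheromone {
		return false
	}
	for _, t := range p.Types {
		if t == f.Type {
			return true
		}
	}
	return false
}

func main() {
	exploit := Predicate{Types: []FindingType{"CVE_MATCH"}, MinPheromone: 0.5}
	fmt.Println(exploit.Matches(Finding{Type: "CVE_MATCH", Pheromone: 0.9})) // true
	fmt.Println(exploit.Matches(Finding{Type: "CVE_MATCH", Pheromone: 0.3})) // false
	fmt.Println(exploit.Matches(Finding{Type: "SUBDOMAIN", Pheromone: 0.9})) // false
}
```

A scheduler built on this idea would evaluate each registered predicate against new findings and hand matching ones to the owning agent.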
Agent Roster
1. Seed Agent
The Seed agent is not a long-running worker — it is a function called once by the engine at campaign start. Without a seed there is nothing on the board, so no trigger predicate ever fires.
Implementation: internal/swarm/agents/seed.go — agents.Seed()
What it writes:
| Finding type | Pheromone base | Half-life |
|---|---|---|
| TARGET_REGISTERED | 1.0 | 24 hours |
// Seed writes the initial TARGET_REGISTERED finding that kicks the swarm off.
func Seed(
	ctx context.Context,
	board blackboard.Board,
	campaignID uuid.UUID,
	target, objective string,
	tun *tuning.Settings,
) error {
	data, _ := json.Marshal(map[string]any{
		"target":    target,
		"objective": objective,
	})
	base, half := tun.Lookup(blackboard.TypeTargetRegistered)
	_, err := board.Write(ctx, blackboard.Finding{
		CampaignID:    campaignID,
		AgentName:     "engine",
		Type:          blackboard.TypeTargetRegistered,
		Target:        target,
		Data:          data,
		PheromoneBase: base,
		HalfLifeSec:   half,
	})
	return err
}
2. Recon Agent
The Recon agent is the first specialist woken by the swarm. It triggers on the TARGET_REGISTERED finding written by the seed, runs the full ProjectDiscovery tool stack against the target, and fans out one blackboard finding per discovered asset.
Trigger predicate:
blackboard.Predicate{
	Types: []blackboard.FindingType{blackboard.TypeTargetRegistered},
}
Implementation: internal/swarm/agents/recon.go wraps internal/agent/recon/agent.go
Tools used (selected by target type):
| Target type | Tool order |
|---|---|
| Domain | subfinder → dnsx → naabu → httpx → katana → gau → nuclei |
| URL | httpx → katana → gau → nuclei |
| IP | naabu → httpx → nuclei |
All tools are provided by the ProjectDiscovery Go library stack plus nmap for service/version detection. Every tool invocation is scope-validated before execution.
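The table above can be read as a lookup from target type to an ordered tool list. The sketch below illustrates that idea only; `targetType` is a hypothetical classifier with deliberately naive rules, and none of the names come from the project's code.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// toolOrder mirrors the "Tools used" table.
var toolOrder = map[string][]string{
	"domain": {"subfinder", "dnsx", "naabu", "httpx", "katana", "gau", "nuclei"},
	"url":    {"httpx", "katana", "gau", "nuclei"},
	"ip":     {"naabu", "httpx", "nuclei"},
}

// targetType (hypothetical) classifies a target string. Anything that is
// neither an http(s) URL nor a parseable IP is treated as a domain.
func targetType(target string) string {
	switch {
	case strings.HasPrefix(target, "http://"), strings.HasPrefix(target, "https://"):
		return "url"
	case net.ParseIP(target) != nil:
		return "ip"
	default:
		return "domain"
	}
}

func main() {
	for _, t := range []string{"example.com", "https://example.com/login", "192.0.2.10"} {
		kind := targetType(t)
		fmt.Printf("%s (%s): %s\n", t, kind, strings.Join(toolOrder[kind], " → "))
	}
}
```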
What it writes:
| Finding type | Pheromone base | Half-life | Source |
|---|---|---|---|
| SUBDOMAIN | 0.7 | 2 hours | subfinder / dnsx |
| PORT_OPEN | 0.8 | 1 hour | naabu / nmap |
| SERVICE | 0.8 | 1 hour | nmap service detection |
| HTTP_ENDPOINT | 0.6 | 2 hours | httpx / katana / gau |
| HTTP_ENDPOINT_INTERESTING | 0.9 | 2 hours | katana interesting-flag heuristic (written as HTTP_ENDPOINT on the wire) |
| TECHNOLOGY | 0.5 | 2 hours | httpx tech fingerprint |
LLM role: After tools complete, the recon agent calls the LLM to analyse raw tool output and structure it into an AttackSurface. This is the only LLM call in the recon path — the individual tool runs are direct subprocess/library calls.
Prompt caching is enabled by default for the Recon agent on Claude. The system prompt and tool definitions are cached, so repeated recon calls within the same campaign reuse the cached tokens, which significantly reduces latency and cost on subsequent invocations.
3. Classifier Agent
The Classifier agent enriches raw recon findings with CVE mappings, CVSS v3.1 scores, and severity labels. It runs concurrently against multiple findings — up to 3 in parallel by default — and only processes findings whose pheromone weight is still above 0.2, discarding stale data before spending LLM tokens.
Trigger predicate:
blackboard.Predicate{
	Types: []blackboard.FindingType{
		blackboard.TypeSubdomain,
		blackboard.TypePortOpen,
		blackboard.TypeService,
		blackboard.TypeHTTPEndpoint,
		blackboard.TypeTechnology,
	},
	MinPheromone: 0.2,
}
Implementation: internal/swarm/agents/classifier.go wraps internal/agent/classifier/agent.go
What it does:
- Each finding passes through an FPFilter that scores its false-positive probability and drops obvious noise before any LLM call is made.
- Remaining findings are batched (groups of 20) and sent to the LLM using a structured emit_classified_findings tool with a JSON-Schema enum for severity and confidence. This guarantees parseable output on providers that support tool use.
- On providers without tool use (Ollama, LM Studio), the classifier falls back to a JSON-in-prompt path.
- Each classified finding is written to the board with a pheromone and half-life that reflect its severity.
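The batching step in the list above can be sketched as a generic chunking helper. `batches` is a hypothetical function written for illustration (it needs Go 1.18 or later for generics); how the classifier actually splits its input is not shown on this page.

```go
package main

import "fmt"

// batches splits items into consecutive groups of at most size elements.
// The final group may be shorter. A non-positive size yields nil.
func batches[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	findings := make([]int, 45) // 45 placeholder findings
	for i, b := range batches(findings, 20) {
		fmt.Printf("batch %d: %d findings\n", i+1, len(b))
	}
}
```

With 45 findings and a batch size of 20, this produces two full batches and one batch of five, so the LLM is called three times instead of 45.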
Pheromone-by-severity mapping:
| Severity | Pheromone base | Half-life |
|---|---|---|
| Critical | 1.0 | 6 hours |
| High | 0.9 | 3 hours |
| Medium | 0.6 | 1 hour |
| Low | 0.4 | 30 minutes |
| Info / default | 0.2 | 10 minutes |
What it writes:
| Finding type | Condition |
|---|---|
| CVE_MATCH | One or more CVE IDs identified |
| MISCONFIGURATION | No CVE, but a security misconfiguration found |
Prompt caching is enabled by default for the Classifier agent on Claude. The structured tool definition for emit_classified_findings is cached, so each batched LLM call only pays for the variable portion of the prompt.
4. Exploit Agent
The Exploit agent acts on high-confidence, recent CVE_MATCH findings. The MinPheromone: 0.5 threshold restricts it to findings that are both credible (the classifier assigned a severity whose pheromone base is at least 0.5) and recent. Findings that have decayed below the threshold are excluded automatically at the board layer, not by the agent.
Trigger predicate:
blackboard.Predicate{
	Types:        []blackboard.FindingType{blackboard.TypeCVEMatch},
	MinPheromone: 0.5, // only act on credible, recent findings
}
Implementation: internal/swarm/agents/exploit.go wraps internal/agent/exploit/agent.go
What it does:
- Decodes the ClassifiedFinding from the blackboard finding’s Data field.
- Passes it to the exploit agent’s BuildPlan(), which first applies rule-based chain generation, then sends both the finding and the rule-based chains to the LLM for reasoning and enhancement. The LLM reasons in <think> tags before producing a JSON attack plan.
- The top attack path is always written to the board as EXPLOIT_CHAIN, making it visible to the report agent regardless of whether execution is attempted.
- If an executor is wired and --dry-run is not set, each step in the top path is executed via the scope-validated command executor. Each execution result becomes an EXPLOIT_RESULT finding.
What it writes:
| Finding type | Pheromone base | Notes |
|---|---|---|
| EXPLOIT_CHAIN | 0.9 | Written for every plan, even in dry-run mode |
| EXPLOIT_RESULT | 0.7 (success: 1.0) | One per executed step; PheromoneBase overridden to 1.0 on success |
Safety: The executor rejects destructive tokens (rm, kill, chmod, drop, truncate, and others) at parse time when --safe-mode is active. The --assist flag adds a human-in-the-loop confirmation before every executed step.
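A parse-time token check of the kind described above might look like the following sketch. It is an illustration under stated assumptions: the deny-list contains only the five tokens named in the text (the real list includes others), tokenising on whitespace and stripping surrounding quotes is a guess at the parsing rule, and `rejectDestructive` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// destructive holds only the tokens named in the documentation;
// the project's real deny-list is longer.
var destructive = map[string]bool{
	"rm": true, "kill": true, "chmod": true, "drop": true, "truncate": true,
}

// rejectDestructive returns an error if any whitespace-separated token of
// cmd, with surrounding quotes and punctuation stripped, is on the deny-list.
func rejectDestructive(cmd string) error {
	for _, tok := range strings.Fields(cmd) {
		word := strings.ToLower(strings.Trim(tok, "'\";()"))
		if destructive[word] {
			return fmt.Errorf("destructive token %q rejected in safe mode", word)
		}
	}
	return nil
}

func main() {
	for _, cmd := range []string{
		"nmap -sV 192.0.2.10",
		"rm -rf /tmp/loot",
		"psql -c 'DROP TABLE users'",
	} {
		fmt.Printf("%-30q -> %v\n", cmd, rejectDestructive(cmd))
	}
}
```

A check this simple errs on the side of rejection: it also blocks harmless commands that happen to contain a listed word as an argument, which is usually the right trade-off for a safe mode.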
5. Report Agent
The Report agent fires exactly once per campaign, triggered by the CAMPAIGN_COMPLETE finding that the scheduler writes when the campaign winds down (time budget, SIGINT, or an explicit completion signal).
Trigger predicate:
blackboard.Predicate{
	Types: []blackboard.FindingType{blackboard.TypeCampaignComplete},
}
Implementation: internal/swarm/agents/report.go wraps internal/agent/report/agent.go
What it does:
- Queries the board for CVE_MATCH and MISCONFIGURATION findings above the publishThreshold (default: 0.5). Findings below the threshold (those superseded by the ConfirmationAgent, or agent-error noise) are excluded automatically.
- Queries for EXPLOIT_CHAIN and EXPLOIT_RESULT findings to reconstruct the attack narrative.
- Deduplicates findings and passes the full set to the report agent for LLM-generated writeups (executive summary, per-finding remediation, attack narrative).
- Appends an ROI footer if a metered LLM provider was wired (spend vs. estimated bounty value).
- Renders output to the configured format(s) and writes files to disk.
Output formats:
| Format | File extension | Use case |
|---|---|---|
| Markdown | .md | Default; readable in any editor |
| HTML | .html | Browser-ready report |
| JSON | .json | Machine-readable; CI/CD integration |
| SARIF 2.1.0 | .sarif | GitHub Code Scanning, VS Code |
| All | (all four) | Pass --format all |
The publish threshold can be lowered to 0.1 with --publish-unverified, which includes suspected-but-not-reproduced findings and marks them with a banner in the report. The default of 0.5 is the “bug bounty mode”: the report contains only verified findings that have not been superseded by a reproduction check.
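The threshold and deduplication steps can be sketched together as a single filter. This is a minimal illustration: `Finding` is a simplified stand-in, `publishable` is a hypothetical name, deduplicating on type, target, and title is an assumption about what counts as a duplicate, and the CVE identifier is an obvious placeholder.

```go
package main

import "fmt"

// Finding is a simplified stand-in for a board finding.
type Finding struct {
	Type      string
	Target    string
	Title     string
	Pheromone float64 // current weight
}

// publishable keeps findings at or above threshold and drops repeats.
// The (Type, Target, Title) dedup key is an assumption.
func publishable(findings []Finding, threshold float64) []Finding {
	seen := make(map[string]bool)
	var out []Finding
	for _, f := range findings {
		if f.Pheromone < threshold {
			continue
		}
		key := f.Type + "|" + f.Target + "|" + f.Title
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, f)
	}
	return out
}

func main() {
	findings := []Finding{
		{"CVE_MATCH", "app.example.com", "CVE-0000-0001", 0.9},
		{"CVE_MATCH", "app.example.com", "CVE-0000-0001", 0.9}, // duplicate
		{"MISCONFIGURATION", "api.example.com", "open directory listing", 0.3},
	}
	fmt.Println(len(publishable(findings, 0.5))) // default threshold: 1
	fmt.Println(len(publishable(findings, 0.1))) // --publish-unverified: 2
}
```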
Agent Model Configuration
All agents inherit from the orchestrator’s LLM provider configuration by default. The model is config-driven — any Anthropic model ID, OpenAI-compatible endpoint, or Ollama model name is accepted.
# ~/.pentestswarm/config.yaml
orchestrator:
  provider: claude              # claude | openai | ollama | lmstudio
  model: claude-sonnet-4-5      # any valid model ID for the chosen provider
  api_key: ""                   # stored in OS keychain; not written here
agents:
  recon:
    model: claude-haiku-4-5     # cheaper model for high-volume recon
  classifier:
    model: claude-sonnet-4-5
  exploit:
    model: claude-opus-4-5      # strongest model for exploit reasoning
  report:
    model: claude-sonnet-4-5
Per-agent overrides let you run high-volume agents such as recon on fast, inexpensive models while directing exploit reasoning to the most capable available model. To switch to an Ollama model or an OpenAI-compatible endpoint, change provider and model; no other config changes are needed.
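The inheritance rule can be sketched as a small resolver: start from the orchestrator settings and apply an agent's model override if one exists. The struct layout and the `Resolve` method are hypothetical, and the sketch assumes that agents override only the model and that `agents` is a top-level key alongside `orchestrator`.

```go
package main

import "fmt"

// LLM is the effective provider/model pair an agent ends up using.
type LLM struct {
	Provider string
	Model    string
}

// Config is a hypothetical in-memory view of config.yaml.
type Config struct {
	Orchestrator LLM
	AgentModels  map[string]string // agent name -> model override
}

// Resolve returns the orchestrator settings, with the model replaced
// when the named agent declares a non-empty override.
func (c Config) Resolve(agent string) LLM {
	out := c.Orchestrator
	if m, ok := c.AgentModels[agent]; ok && m != "" {
		out.Model = m
	}
	return out
}

func main() {
	cfg := Config{
		Orchestrator: LLM{Provider: "claude", Model: "claude-sonnet-4-5"},
		AgentModels: map[string]string{
			"recon":   "claude-haiku-4-5",
			"exploit": "claude-opus-4-5",
		},
	}
	for _, name := range []string{"recon", "classifier", "exploit"} {
		fmt.Printf("%-10s -> %+v\n", name, cfg.Resolve(name))
	}
}
```

An agent with no entry, such as classifier in this example, simply inherits the orchestrator's provider and model.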
Adding Custom Agents
Any type that implements the swarm.Agent interface can join the swarm. The minimum implementation is three methods plus a handler function:
// From internal/swarm/agent.go
type Agent interface {
	Name() string
	Trigger() blackboard.Predicate
	MaxConcurrency() int
	Handle(ctx context.Context, f blackboard.Finding, board blackboard.Board) error
}
For simple one-off agents, the NamedPredicate helper removes the boilerplate:
customAgent := swarm.NamedPredicate{
	AgentName: "my-custom-agent",
	Pred: blackboard.Predicate{
		Types:        []blackboard.FindingType{blackboard.TypeHTTPEndpoint},
		MinPheromone: 0.3,
	},
	Parallel: 2,
	Fn: func(ctx context.Context, f blackboard.Finding, board blackboard.Board) error {
		// React to HTTP_ENDPOINT findings here.
		// Write new findings back to the board.
		return nil
	},
}
sched.Register(customAgent)
The scheduler handles subscription, cursor management, budget gating, rate limiting, and observability automatically. The custom agent only needs to implement the reaction logic.