Pentest Swarm AI deploys five specialist agents, each responsible for a distinct phase of a penetration test. Rather than being called in sequence by a central planner, each agent declares a trigger predicate — a set of conditions on the shared blackboard — and is dispatched automatically when matching findings appear. Every agent that interacts with an LLM uses a ReAct (Reason + Act) loop: the model reasons about the current finding, decides which tool to invoke, observes the result, and iterates until it has enough information to write a conclusion back to the blackboard.
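The reason, act, observe cycle described above can be sketched in a few lines. This is a minimal illustration and not the project's implementation: the Step and Model types, the reactLoop function, the tool registry, and the scripted stand-in for the LLM are all hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is one model decision: either call a tool or finish with a conclusion.
// All names in this sketch are hypothetical, not taken from the project.
type Step struct {
	Thought    string
	Tool       string // empty when the model is done
	Input      string
	Conclusion string
}

// Model decides the next step given the finding and the observations so far.
type Model interface {
	Next(finding string, observations []string) Step
}

// Tool is any action the loop can invoke on the model's behalf.
type Tool func(input string) string

// reactLoop alternates reasoning and acting until the model concludes
// or the iteration budget runs out.
func reactLoop(m Model, tools map[string]Tool, finding string, maxIters int) (string, error) {
	var observations []string
	for i := 0; i < maxIters; i++ {
		step := m.Next(finding, observations)
		if step.Tool == "" {
			return step.Conclusion, nil
		}
		tool, ok := tools[step.Tool]
		if !ok {
			// Feed the mistake back so the model can correct itself.
			observations = append(observations, "unknown tool: "+step.Tool)
			continue
		}
		observations = append(observations, tool(step.Input))
	}
	return "", errors.New("iteration budget exhausted")
}

// scriptedModel stands in for an LLM: it calls one tool, then concludes.
type scriptedModel struct{}

func (scriptedModel) Next(finding string, obs []string) Step {
	if len(obs) == 0 {
		return Step{Thought: "need a banner first", Tool: "probe", Input: finding}
	}
	return Step{Conclusion: "observed: " + obs[0]}
}

func main() {
	tools := map[string]Tool{
		"probe": func(in string) string { return "banner for " + in },
	}
	out, err := reactLoop(scriptedModel{}, tools, "example.com:443", 5)
	fmt.Println(out, err)
}
```

In a real agent the conclusion would be written back to the blackboard as a new finding; the iteration cap is one simple way to bound token spend per finding.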

Agent Roster

1. Seed Agent

The Seed agent is not a long-running worker; it is a function called once by the engine at campaign start. Without a seed there is nothing on the board, so no trigger predicate ever fires.

Implementation: internal/swarm/agents/seed.go, function agents.Seed()

What it writes:

| Finding type | Pheromone base | Half-life |
| --- | --- | --- |
| TARGET_REGISTERED | 1.0 | 24 hours |

// Seed writes the initial TARGET_REGISTERED finding that kicks the swarm off.
func Seed(
    ctx context.Context,
    board blackboard.Board,
    campaignID uuid.UUID,
    target, objective string,
    tun *tuning.Settings,
) error {
    data, _ := json.Marshal(map[string]any{
        "target":    target,
        "objective": objective,
    })
    base, half := tun.Lookup(blackboard.TypeTargetRegistered)
    _, err := board.Write(ctx, blackboard.Finding{
        CampaignID:    campaignID,
        AgentName:     "engine",
        Type:          blackboard.TypeTargetRegistered,
        Target:        target,
        Data:          data,
        PheromoneBase: base,
        HalfLifeSec:   half,
    })
    return err
}

2. Recon Agent

The Recon agent is the first specialist woken by the swarm. It triggers on the TARGET_REGISTERED finding written by the seed, runs the full ProjectDiscovery tool stack against the target, and fans out one blackboard finding per discovered asset.

Trigger predicate:
blackboard.Predicate{
    Types: []blackboard.FindingType{blackboard.TypeTargetRegistered},
}
Implementation: internal/swarm/agents/recon.go wraps internal/agent/recon/agent.go

Tools used (selected by target type):

| Target type | Tool order |
| --- | --- |
| Domain | subfinder → dnsx → naabu → httpx → katana → gau → nuclei |
| URL | httpx → katana → gau → nuclei |
| IP | naabu → httpx → nuclei |
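The table above amounts to a lookup keyed by target type. The following is a minimal sketch of that selection; the targetType heuristic (URL scheme prefix, then IP parse, otherwise domain) is an assumption made for illustration and may differ from how the project classifies targets.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// toolChains mirrors the tool-order table above.
var toolChains = map[string][]string{
	"domain": {"subfinder", "dnsx", "naabu", "httpx", "katana", "gau", "nuclei"},
	"url":    {"httpx", "katana", "gau", "nuclei"},
	"ip":     {"naabu", "httpx", "nuclei"},
}

// targetType is a hypothetical classifier; the project's real detection
// logic may differ.
func targetType(target string) string {
	switch {
	case strings.HasPrefix(target, "http://"), strings.HasPrefix(target, "https://"):
		return "url"
	case net.ParseIP(target) != nil:
		return "ip"
	default:
		return "domain"
	}
}

func main() {
	for _, t := range []string{"example.com", "https://example.com/login", "192.0.2.10"} {
		fmt.Println(t, "->", toolChains[targetType(t)])
	}
}
```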
All tools are provided by the ProjectDiscovery Go library stack plus nmap for service/version detection. Every tool invocation is scope-validated before execution.

What it writes:

| Finding type | Pheromone base | Half-life | Source |
| --- | --- | --- | --- |
| SUBDOMAIN | 0.7 | 2 hours | subfinder / dnsx |
| PORT_OPEN | 0.8 | 1 hour | naabu / nmap |
| SERVICE | 0.8 | 1 hour | nmap service detection |
| HTTP_ENDPOINT | 0.6 | 2 hours | httpx / katana / gau |
| HTTP_ENDPOINT_INTERESTING | 0.9 | 2 hours | katana interesting-flag heuristic (written as HTTP_ENDPOINT on the wire) |
| TECHNOLOGY | 0.5 | 2 hours | httpx tech fingerprint |

LLM role: After tools complete, the recon agent calls the LLM to analyse raw tool output and structure it into an AttackSurface. This is the only LLM call in the recon path — the individual tool runs are direct subprocess/library calls.
Prompt caching is enabled by default for the Recon agent on Claude. The system prompt and tool definitions are cached, so repeated recon calls within the same campaign share cached tokens and pay significantly reduced latency and cost on subsequent invocations.

3. Classifier Agent

The Classifier agent enriches raw recon findings with CVE mappings, CVSS v3.1 scores, and severity labels. It processes several findings concurrently (up to 3 in parallel by default) and only handles findings whose pheromone weight is still above 0.2, so stale data is discarded before any LLM tokens are spent.

Trigger predicate:
blackboard.Predicate{
    Types: []blackboard.FindingType{
        blackboard.TypeSubdomain,
        blackboard.TypePortOpen,
        blackboard.TypeService,
        blackboard.TypeHTTPEndpoint,
        blackboard.TypeTechnology,
    },
    MinPheromone: 0.2,
}
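How a predicate like this might be evaluated can be sketched as a type check followed by a weight check. The sketch assumes exponential half-life decay, weight = base × 0.5^(age / half-life), which the text implies but does not spell out; the Finding and Predicate types here are simplified stand-ins, and whether the boundary value itself matches is also an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// Finding and Predicate are simplified stand-ins for the blackboard types.
type Finding struct {
	Type          string
	PheromoneBase float64
	HalfLifeSec   float64
	AgeSec        float64
}

type Predicate struct {
	Types        []string
	MinPheromone float64
}

// weight applies exponential half-life decay (an assumption in this sketch).
func weight(f Finding) float64 {
	return f.PheromoneBase * math.Pow(0.5, f.AgeSec/f.HalfLifeSec)
}

// Matches reports whether the finding has a listed type and enough
// remaining pheromone. Treating the threshold as inclusive is an assumption.
func (p Predicate) Matches(f Finding) bool {
	for _, t := range p.Types {
		if t == f.Type {
			return weight(f) >= p.MinPheromone
		}
	}
	return false
}

func main() {
	p := Predicate{Types: []string{"PORT_OPEN", "SERVICE"}, MinPheromone: 0.2}
	fresh := Finding{Type: "PORT_OPEN", PheromoneBase: 0.8, HalfLifeSec: 3600}
	stale := fresh
	stale.AgeSec = 3 * 3600 // three half-lives: 0.8 decays to 0.1
	fmt.Println(p.Matches(fresh), p.Matches(stale))
}
```

Under this model a PORT_OPEN finding (base 0.8, half-life 1 hour) stops matching the classifier's 0.2 threshold after two half-lives.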
Implementation: internal/swarm/agents/classifier.go wraps internal/agent/classifier/agent.go

What it does:
  1. Each finding passes through an FPFilter that scores its false-positive probability and drops obvious noise before any LLM call is made.
  2. Remaining findings are batched (groups of 20) and sent to the LLM using a structured emit_classified_findings tool with a JSON-Schema enum for severity and confidence. This guarantees parseable output on providers that support tool use.
  3. On providers without tool use (Ollama, LM Studio), the classifier falls back to a JSON-in-prompt path.
  4. Each classified finding is written to the board with a pheromone and half-life that reflect its severity.
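Steps 1 and 2 above can be sketched as a filter followed by a chunking helper. This is an illustration only: the FPScore field, the dropNoise and batch functions, and the 0.8 cut-off are hypothetical and do not describe the project's FPFilter.

```go
package main

import "fmt"

// Finding is a stand-in type; FPScore is a hypothetical false-positive
// probability in [0, 1].
type Finding struct {
	ID      string
	FPScore float64
}

// dropNoise keeps findings whose false-positive score is below maxFP.
func dropNoise(in []Finding, maxFP float64) []Finding {
	var out []Finding
	for _, f := range in {
		if f.FPScore < maxFP {
			out = append(out, f)
		}
	}
	return out
}

// batch splits items into consecutive groups of at most size elements.
func batch(items []Finding, size int) [][]Finding {
	if size <= 0 {
		return nil
	}
	var out [][]Finding
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	var findings []Finding
	for i := 0; i < 50; i++ {
		score := 0.1
		if i%10 == 0 {
			score = 0.95 // every tenth finding is obvious noise
		}
		findings = append(findings, Finding{ID: fmt.Sprintf("f-%d", i), FPScore: score})
	}
	kept := dropNoise(findings, 0.8) // 0.8 is an illustrative cut-off
	for i, b := range batch(kept, 20) {
		fmt.Printf("batch %d: %d findings\n", i, len(b))
	}
}
```

Filtering before batching means the LLM is only ever billed for findings that survived the noise check.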
Pheromone-by-severity mapping:
| Severity | Pheromone base | Half-life |
| --- | --- | --- |
| Critical | 1.0 | 6 hours |
| High | 0.9 | 3 hours |
| Medium | 0.6 | 1 hour |
| Low | 0.4 | 30 minutes |
| Info / default | 0.2 | 10 minutes |

What it writes:
| Finding type | Condition |
| --- | --- |
| CVE_MATCH | One or more CVE IDs identified |
| MISCONFIGURATION | No CVE, but a security misconfiguration found |

Prompt caching is enabled by default for the Classifier agent on Claude. The structured tool definition for emit_classified_findings is cached, so each batched classification call only pays for the variable portion of the prompt.

4. Exploit Agent

The Exploit agent acts on credible, recent CVE_MATCH findings. Because the classifier assigns a pheromone base by severity and that weight decays over time, the MinPheromone: 0.5 threshold admits only findings that were rated severe enough and are still fresh. Going by the severity mapping above, Low (0.4) and Info (0.2) findings start below the threshold and never qualify. Decayed findings are excluded at the board layer, not by the agent.

Trigger predicate:
blackboard.Predicate{
    Types:        []blackboard.FindingType{blackboard.TypeCVEMatch},
    MinPheromone: 0.5, // only act on credible, recent findings
}
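As a worked example of what the 0.5 threshold means in practice, the sketch below computes how long a finding of each severity stays eligible. It assumes exponential decay, weight(t) = base × 0.5^(t / half-life), so the time above a threshold is half-life × log2(base / threshold); the decay model and the hoursAbove helper are assumptions for illustration, not the board's confirmed implementation.

```go
package main

import (
	"fmt"
	"math"
)

// hoursAbove returns how long a finding stays at or above threshold,
// assuming weight(t) = base * 0.5^(t/halfLifeHours).
func hoursAbove(base, halfLifeHours, threshold float64) float64 {
	if base < threshold {
		return 0 // never eligible
	}
	return halfLifeHours * math.Log2(base/threshold)
}

func main() {
	// Bases and half-lives come from the classifier's severity mapping.
	rows := []struct {
		name       string
		base, half float64
	}{
		{"Critical", 1.0, 6},
		{"High", 0.9, 3},
		{"Medium", 0.6, 1},
		{"Low", 0.4, 0.5},
		{"Info", 0.2, 1.0 / 6},
	}
	for _, r := range rows {
		fmt.Printf("%-8s eligible for %.2f h\n", r.name, hoursAbove(r.base, r.half, 0.5))
	}
}
```

Under that assumption a Critical match remains actionable for 6 hours, a High match for roughly 2.5 hours, and a Medium match for about 16 minutes.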
Implementation: internal/swarm/agents/exploit.go wraps internal/agent/exploit/agent.go

What it does:
  1. Decodes the ClassifiedFinding from the blackboard finding’s Data field.
  2. Passes it to the exploit agent’s BuildPlan(), which first applies rule-based chain generation, then sends both the finding and the rule-based chains to the LLM for reasoning and enhancement. The LLM reasons in <think> tags before producing a JSON attack plan.
  3. The top attack path is always written to the board as EXPLOIT_CHAIN, making it visible to the report agent regardless of whether execution is attempted.
  4. If an executor is wired and --dry-run is not set, each step in the top path is executed via the scope-validated command executor. Each execution result becomes an EXPLOIT_RESULT finding.
What it writes:
| Finding type | Pheromone base | Notes |
| --- | --- | --- |
| EXPLOIT_CHAIN | 0.9 | Written for every plan, even in dry-run mode |
| EXPLOIT_RESULT | 0.7 (1.0 on success) | One per executed step; PheromoneBase is overridden to 1.0 on success |

Safety: The executor rejects destructive tokens (rm, kill, chmod, drop, truncate, and others) at parse time when --safe-mode is active. The --assist flag adds a human-in-the-loop confirmation before every executed step.
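A parse-time token check of this kind can be sketched as follows. The sketch covers only the five tokens named above (the full list is not given here), and the whitespace tokenisation and case-insensitive comparison are assumptions; the firstDestructive helper is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// destructive holds only the tokens named in the text; the project's
// list is longer.
var destructive = map[string]bool{
	"rm": true, "kill": true, "chmod": true, "drop": true, "truncate": true,
}

// firstDestructive returns the first blocked token in the command line.
// Matching whole whitespace-separated tokens avoids false hits on words
// that merely contain a blocked substring.
func firstDestructive(cmd string) (string, bool) {
	for _, tok := range strings.Fields(cmd) {
		if destructive[strings.ToLower(tok)] {
			return tok, true
		}
	}
	return "", false
}

func main() {
	for _, cmd := range []string{
		"curl -s https://example.com/",
		"rm -rf /tmp/x",
		"echo format",
	} {
		tok, blocked := firstDestructive(cmd)
		fmt.Printf("%q blocked=%v token=%q\n", cmd, blocked, tok)
	}
}
```

A token filter like this is a coarse guard; it complements scope validation and the --assist confirmation rather than replacing them.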

5. Report Agent

The Report agent fires exactly once per campaign, triggered by the CAMPAIGN_COMPLETE finding that the scheduler writes when the campaign winds down (time budget, SIGINT, or an explicit completion signal).

Trigger predicate:
blackboard.Predicate{
    Types: []blackboard.FindingType{blackboard.TypeCampaignComplete},
}
Implementation: internal/swarm/agents/report.go wraps internal/agent/report/agent.go

What it does:
  1. Queries the board for CVE_MATCH and MISCONFIGURATION findings above the publishThreshold (default: 0.5). Findings below the threshold — superseded by the ConfirmationAgent, or agent-error noise — are excluded automatically.
  2. Queries for EXPLOIT_CHAIN and EXPLOIT_RESULT findings to reconstruct the attack narrative.
  3. Deduplicates findings and passes the full set to the report agent for LLM-generated writeups (executive summary, per-finding remediation, attack narrative).
  4. Appends an ROI footer if a metered LLM provider was wired (spend vs. estimated bounty value).
  5. Renders output to the configured format(s) and writes files to disk.
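Steps 1 and 3 above (threshold filtering and deduplication) can be sketched together. The Finding shape and the deduplication key of type, target, and title are hypothetical choices for illustration; the project's report agent may key duplicates differently.

```go
package main

import "fmt"

// Finding is a simplified stand-in for a blackboard finding with its
// current (decayed) pheromone weight.
type Finding struct {
	Type      string
	Target    string
	Title     string
	Pheromone float64
}

// publishable keeps findings at or above threshold and drops duplicates,
// keyed (hypothetically) by type, target, and title. Order is preserved.
func publishable(in []Finding, threshold float64) []Finding {
	seen := make(map[string]bool)
	var out []Finding
	for _, f := range in {
		if f.Pheromone < threshold {
			continue
		}
		key := f.Type + "|" + f.Target + "|" + f.Title
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, f)
	}
	return out
}

func main() {
	all := []Finding{
		{"CVE_MATCH", "app.example.com", "Outdated TLS library", 0.9},
		{"CVE_MATCH", "app.example.com", "Outdated TLS library", 0.8}, // duplicate
		{"MISCONFIGURATION", "app.example.com", "Directory listing", 0.3},
	}
	fmt.Println(len(publishable(all, 0.5)), "finding(s) at the default threshold")
	fmt.Println(len(publishable(all, 0.1)), "finding(s) with --publish-unverified")
}
```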
Output formats:
| Format | File extension | Use case |
| --- | --- | --- |
| Markdown | .md | Default; readable in any editor |
| HTML | .html | Browser-ready report |
| JSON | .json | Machine-readable; CI/CD integration |
| SARIF 2.1.0 | .sarif | GitHub Code Scanning, VS Code |
| All | (all four) | Pass --format all |

The publish threshold can be lowered to 0.1 with --publish-unverified, which adds suspected-but-not-reproduced findings to the report under a banner. The default of 0.5 is the “bug bounty mode”: the report contains only verified findings that have not been superseded by a reproduction check.

Agent Model Configuration

All agents inherit from the orchestrator’s LLM provider configuration by default. The model is config-driven — any Anthropic model ID, OpenAI-compatible endpoint, or Ollama model name is accepted.
# ~/.pentestswarm/config.yaml
orchestrator:
  provider: claude          # claude | openai | ollama | lmstudio
  model: claude-sonnet-4-5  # any valid model ID for the chosen provider
  api_key: ""               # stored in OS keychain; not written here

agents:
  recon:
    model: claude-haiku-4-5  # cheaper model for high-volume recon
  classifier:
    model: claude-sonnet-4-5
  exploit:
    model: claude-opus-4-5   # strongest model for exploit reasoning
  report:
    model: claude-sonnet-4-5
Per-agent overrides let you run high-volume agents (recon, classifier) on fast, inexpensive models while directing exploit reasoning to the most capable model available. To use an Ollama model or an OpenAI-compatible endpoint instead, change provider and model; no other config changes are needed.
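The inheritance rule can be sketched as a lookup with a fallback. The Config struct and ModelFor method below are hypothetical and only model the behaviour described here: an agent uses its own model when one is set, otherwise the orchestrator's.

```go
package main

import "fmt"

// Config is a hypothetical in-memory view of the YAML above.
type Config struct {
	OrchestratorModel string
	AgentModels       map[string]string // agent name -> model override
}

// ModelFor returns the agent's override if present and non-empty,
// otherwise the orchestrator's model.
func (c Config) ModelFor(agent string) string {
	if m, ok := c.AgentModels[agent]; ok && m != "" {
		return m
	}
	return c.OrchestratorModel
}

func main() {
	cfg := Config{
		OrchestratorModel: "claude-sonnet-4-5",
		AgentModels: map[string]string{
			"recon":   "claude-haiku-4-5",
			"exploit": "claude-opus-4-5",
		},
	}
	for _, agent := range []string{"recon", "classifier", "exploit", "report"} {
		fmt.Printf("%-10s -> %s\n", agent, cfg.ModelFor(agent))
	}
}
```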

Adding Custom Agents

Any type that implements the swarm.Agent interface can join the swarm. The interface has four methods: three that describe the agent (Name, Trigger, MaxConcurrency) and a Handle function that reacts to each matching finding:
// From internal/swarm/agent.go
type Agent interface {
    Name() string
    Trigger() blackboard.Predicate
    MaxConcurrency() int
    Handle(ctx context.Context, f blackboard.Finding, board blackboard.Board) error
}
For simple one-off agents, the NamedPredicate helper removes the boilerplate:
customAgent := swarm.NamedPredicate{
    AgentName: "my-custom-agent",
    Pred: blackboard.Predicate{
        Types:        []blackboard.FindingType{blackboard.TypeHTTPEndpoint},
        MinPheromone: 0.3,
    },
    Parallel: 2,
    Fn: func(ctx context.Context, f blackboard.Finding, board blackboard.Board) error {
        // React to HTTP_ENDPOINT findings here.
        // Write new findings back to the board.
        return nil
    },
}
sched.Register(customAgent)
The scheduler handles subscription, cursor management, budget gating, rate limiting, and observability automatically. The custom agent only needs to implement the reaction logic.
