Building Your CARM Template Corpus with Harvest Mode

Harvest mode is a special operating mode where CARM retrieval is completely disabled, forcing the agent into zero-shot Z3 generation on every stall. When a generated Z3 script returns SAT and a payload is injected, Agentic-AFL waits up to 90 seconds to measure the AFL++ edge delta. If AFL++ discovers new edges as a direct result of the injection — confirming the payload genuinely bypassed the math wall in the real binary — the (pcode_slice, constraint_tags, z3_script) triplet is automatically committed to the PostgreSQL CARM corpus. Each verified solve becomes a reusable template for future campaigns against similar binaries.

Why Harvest Mode?

CARM (Constraint-Aware Retrieval Memory) retrieval only works when the corpus already contains relevant templates. On a fresh installation, or when targeting a new protocol family, the corpus is empty and retrieval returns nothing. This is the bootstrapping problem: you need solved examples to generate good solutions, but you need good solutions to populate the examples. Harvest mode breaks this cycle by treating the fuzzer itself as the ground-truth oracle:

Zero-shot generation

With CARM disabled, the LLM receives only the extracted P-Code slice and constraint profile — no retrieved hints. Any SAT result is genuinely novel and not influenced by existing corpus entries.

Edge-coverage verification

AFL++ edge delta is the verification oracle. A Z3 script can return SAT but produce bytes that happen to satisfy the symbolic model while failing the actual binary’s runtime checks. New edges are proof the payload worked in the real binary.

How Harvest Mode Works

AgentLoop initializes with harvest_mode=True

CARM retrieval is disabled. When _process_stall reaches the retrieval step, it sets retrieval_results = [] and templates = [] unconditionally, logging "CARM: DISABLED (harvest_mode=True, zero-shot generation)". The rest of the pipeline is unchanged.

Stall triggers zero-shot Z3 generation

The LLM receives the P-Code slice, constraint profile, and correction history (if any ReAct turns have occurred), but no retrieved template hints. The K-way voting and ReAct self-correction loop run exactly as in normal mode.

SAT payload injected; pre-injection edge count recorded

Before writing the payload to AFL++‘s sync directory, Agentic-AFL reads the current edges_found from fuzzer_stats. This establishes the baseline for the verification window.

Edge delta measured over 90-second window

After injection, wait_for_edge_delta polls fuzzer_stats every 2 seconds for up to 90 seconds. The first poll that shows current_edges > baseline_edges returns the delta immediately — there is no need to wait the full 90 seconds if AFL++ responds quickly.

Verified solves committed to PostgreSQL

If edge_delta > 0, the solve is verified. commit_verified_template assembles the triplet — raw P-Code text, integer-encoded constraint tags, and the winning Z3 script — and calls SpecStore.save_spec() followed by store.update_template(). The spec_id is a deterministic SHA-256 hash of the P-Code content (harvest_<12-char-hex>), ensuring idempotent re-harvesting of the same stall site.

Enabling Harvest Mode

Harvest mode is enabled programmatically by passing harvest_mode=True to AgentLoop. The CampaignRunner does not currently expose a --harvest-mode CLI flag — it is designed for use in pre-production corpus-building scripts:

import asyncio
from pathlib import Path
from agentic_afl.orchestrator.agent_loop import AgentLoop

async def main():
    agent = AgentLoop(
        afl_output_dir=Path("./afl_output"),
        afl_sync_dir=Path("./afl_output/agentic/queue"),
        target_binary=Path("./harness"),
        harvest_mode=True,
    )
    await agent.setup()
    await agent.run()

asyncio.run(main())

You are responsible for starting AFL++ separately before calling agent.run(). A typical harvest script launches AFL++ with asyncio.create_subprocess_exec("afl-fuzz", ...) and then starts the agent loop as a concurrent task:

import asyncio
from pathlib import Path
from agentic_afl.orchestrator.agent_loop import AgentLoop

async def main():
    # Start AFL++ first
    afl_proc = await asyncio.create_subprocess_exec(
        "afl-fuzz", "-M", "default",
        "-F", "./afl_output/agentic/queue",
        "-i", "./seeds",
        "-o", "./afl_output",
        "--", "./harness", "@@",
        env={
            **__import__("os").environ,
            "AFL_NO_UI": "1",
            "AFL_SKIP_CPUFREQ": "1",
        },
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )

    # Give AFL++ a few seconds to initialize
    await asyncio.sleep(3)

    # Start the agent in harvest mode
    agent = AgentLoop(
        afl_output_dir=Path("./afl_output"),
        afl_sync_dir=Path("./afl_output/agentic/queue"),
        target_binary=Path("./harness"),
        harvest_mode=True,
    )
    await agent.setup()

    try:
        await agent.run()
    finally:
        afl_proc.terminate()
        await afl_proc.wait()

asyncio.run(main())

Harvest Metrics

AgentLoop tracks harvest-specific counters in its internal _metrics dictionary alongside the standard campaign metrics. Access them at any time via agent.get_metrics(), which also includes queue_size and active_stalls:

Metric key	Description
`stalls_detected`	Number of stall events queued during this harvest run.
`stalls_processed`	Number of stalls that completed the full pipeline (including failed solves).
`harvest_verified`	SAT solves confirmed by a positive AFL++ edge delta (`edge_delta > 0`).
`harvest_committed`	Templates successfully written to the PostgreSQL CARM corpus. Equals `harvest_verified` in a healthy run.

The HarvestResult dataclass — returned when you use the HarvestDaemon — provides per-binary aggregate fields:

Field	Description
`stalls_detected`	Stall events queued for this binary.
`solves_attempted`	Total solve attempts across all stalls.
`solves_sat`	Z3 scripts that returned SAT.
`solves_verified`	SAT solves confirmed by positive edge delta.
`templates_committed`	Triplets successfully written to PostgreSQL.
`verification_rate`	`solves_verified / solves_sat`. Values above 0.6 indicate high-quality zero-shot generation for this target.

The verification step prevents false positives from polluting the corpus. A Z3 script can return SAT but produce a semantically wrong payload — for example, one that satisfies the symbolic CRC equation but uses the wrong byte ordering for the target architecture. The AFL++ edge-coverage oracle catches this: if no new edges appear within 90 seconds of injection, the solve is discarded and nothing is committed to PostgreSQL.

Run harvest mode on a variety of ICS protocol implementations before deploying Agentic-AFL in production. Even a small corpus of 20–30 verified templates covering common constraint patterns (CRC-32, modular arithmetic, byte-swap) significantly improves the LLM’s first-turn SAT rate during normal campaigns, reducing the ReAct turn budget consumed per stall.

Get Started

Configuration

Guides

Architecture

Building Your CARM Template Corpus with Harvest Mode

Why Harvest Mode?

Zero-shot generation

Edge-coverage verification

How Harvest Mode Works

Enabling Harvest Mode

Harvest Metrics

Build docs developers (and LLMs) love

Get Started

Configuration

Guides

Architecture

Documentation Index

​Why Harvest Mode?

Zero-shot generation

Edge-coverage verification

​How Harvest Mode Works

​Enabling Harvest Mode

​Harvest Metrics

Build docs developers (and LLMs) love

Why Harvest Mode?

How Harvest Mode Works

Enabling Harvest Mode

Harvest Metrics