Agentic-AFL uses a strict dataclass pipeline. Each stage of the pipeline consumes the output of the previous stage, and no component reaches across stage boundaries. These types are the public data contracts between components — when adding a new field to any dataclass, update both the producer that populates it and the consumer that reads it.

Data Flow

Binary + StallAddr

PCodeSlice         (pcode_slicer.py produces)

ConstraintProfile  (constraint_profiler.py produces)

VulnerabilitySpec  (spec_exporter.py produces — persisted to PostgreSQL)

StallReport        (stall_detector.py produces)

Z3GenerationRequest (llm_client.py consumes)

Z3Script           (llm_client.py produces — K scripts for voting)

Z3Result           (z3_sandbox.py produces)

SolvedPayload      (extracted from Z3Result.model)

sync_dir/          (payload_injector.py writes)

PCodeInstruction

A single Ghidra P-Code operation extracted from a basic block. P-Code is Ghidra’s architecture-neutral intermediate representation that preserves all semantic information from the original machine code. PCodeInstruction is frozen (immutable).
address
str
The original machine code address as a hex string (e.g., "0x08001234").
mnemonic
str
The P-Code operation mnemonic (e.g., "INT_ADD", "CBRANCH", "LOAD").
inputs
list[str]
List of input varnodes as strings (e.g., ["r0", "0x10", "(ram, 0x20001000, 4)"]).
output
str | None
The output varnode, or None for operations like BRANCH and STORE that produce no output varnode.
raw_pcode
str
The full Ghidra P-Code text line, preserved verbatim for debugging and prompt injection.
call_target
str | None
Resolved function name for CALL operations (e.g., "crc16_modbus"). None for non-call operations or unresolved indirect calls.
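
The field list above can be pictured as a frozen dataclass. The following is a minimal sketch rather than the project's actual definition: the field names and types come from the list above, while the `None` default for `call_target` and the sample `raw_pcode` text are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class PCodeInstruction:
    address: str                     # hex string, e.g. "0x08001234"
    mnemonic: str                    # e.g. "INT_ADD", "CBRANCH", "LOAD"
    inputs: list[str]                # input varnodes as strings
    output: str | None               # None for BRANCH, STORE, ...
    raw_pcode: str                   # verbatim P-Code text line
    call_target: str | None = None   # assumption: defaults to None


insn = PCodeInstruction(
    address="0x08001234",
    mnemonic="INT_ADD",
    inputs=["r0", "0x10"],
    output="r1",
    raw_pcode="r1 = INT_ADD r0, 0x10",  # illustrative text, not real Ghidra output
)

# frozen=True rejects attribute assignment after construction.
try:
    insn.mnemonic = "INT_SUB"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Note that `frozen=True` only blocks attribute reassignment. The `inputs` list itself remains mutable, so producers should treat it as read-only by convention.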

PCodeSlice

A taint-bounded backward slice of P-Code instructions from a stall site. The slice contains only P-Code instructions that are data-dependent on the fuzzer’s input buffer (the taint source). Instructions resolving to global state not connected to the input are pruned.
Produced by: extractor/pcode_slicer.py
Consumed by: extractor/constraint_profiler.py, orchestrator/llm_client.py
binary_path
Path
Path to the analyzed binary.
stall_address
str
The address where AFL++ coverage stalled (hex string, e.g., "0x00401a20").
function_name
str
Ghidra’s decompiled function name. May be "FUN_xxxxx" for stripped binaries.
function_entry
str
Entry point address of the function containing the stall site (hex string).
instructions
list[PCodeInstruction]
Ordered list of P-Code instructions in the backward slice.
taint_source
str
Description of the taint origin, e.g., "RDI" (x86_64 first argument register) or "input_buffer @ 0x20001000".
slice_depth
int
Number of basic blocks traversed backward to build this slice.
truncated
bool
True if the slice was truncated (via assuming(0)) to fit within the LLM token budget.
architecture
Architecture
Target CPU architecture enum value. Determines BitVec widths in generated Z3 scripts.
decompiled_c
str
Ghidra’s decompiled C pseudocode for the containing function. Best-effort; may be empty for obfuscated or heavily optimized binaries.

ConstraintProfile

A structural fingerprint of a stall site’s mathematical constraint type. Produced from P-Code analysis; consumed by CARM retrieval for Jaccard similarity matching. ConstraintProfile is frozen (immutable) and hashable.
Produced by: extractor/constraint_profiler.py
Consumed by: orchestrator/retrieval_carm.py
tags
frozenset[ConstraintTag]
Set of ConstraintTag enum values identifying structural patterns observed in the P-Code slice (e.g., BITWISE_LOOP, CONSTANT_EQUALITY, CALLEE_DEPENDENCY). The full ontology is defined in constants.py.
bitwise_density
float
Ratio of bitwise operations (XOR, AND, OR, shift) to total operations. Range: 0.0 to 1.0. High density indicates hash/CRC-style constraints.
arithmetic_density
float
Ratio of arithmetic operations (INT_ADD, INT_MUL, etc.) to total operations. Range: 0.0 to 1.0.
loop_depth
int
Maximum nesting depth of loop structures detected in the slice. Used to flag COUNTED_LOOP and INPUT_DEPENDENT_LOOP tags.
register_count
int
Number of distinct symbolic registers (varnodes) referenced in the slice. Higher counts correlate with higher estimated_complexity.
estimated_complexity
int
Heuristic difficulty score from 0 (trivial constant equality) to 100 (deeply nested CRC with callee dependencies). Used to order stall processing and select CARM templates.
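
To make the Jaccard matching and the density ratios concrete, here is a small sketch. It is not the profiler's or the CARM retriever's real code: the three `ConstraintTag` members are the ones named above, but the set of mnemonics counted as bitwise and the handling of empty inputs are assumptions.

```python
from enum import Enum, auto


class ConstraintTag(Enum):
    # Only the members named in the text; the full ontology lives in constants.py.
    BITWISE_LOOP = auto()
    CONSTANT_EQUALITY = auto()
    CALLEE_DEPENDENCY = auto()


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity |a & b| / |a | b| between two tag sets."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty tag sets count as dissimilar
    return len(a & b) / len(union)


# Assumption: these P-Code mnemonics are the ones treated as bitwise.
BITWISE_OPS = {"INT_XOR", "INT_AND", "INT_OR", "INT_LEFT", "INT_RIGHT", "INT_SRIGHT"}


def bitwise_density(mnemonics: list[str]) -> float:
    """Fraction of operations in the slice that are bitwise (0.0 to 1.0)."""
    if not mnemonics:
        return 0.0
    return sum(m in BITWISE_OPS for m in mnemonics) / len(mnemonics)


stall_tags = frozenset({ConstraintTag.BITWISE_LOOP, ConstraintTag.CALLEE_DEPENDENCY})
known_tags = frozenset({ConstraintTag.CALLEE_DEPENDENCY, ConstraintTag.CONSTANT_EQUALITY})

similarity = jaccard(stall_tags, known_tags)  # one shared tag out of three distinct tags
density = bitwise_density(["INT_XOR", "INT_ADD", "INT_AND", "LOAD"])
```

Because `tags` is a `frozenset`, set intersection and union work directly on the field, which is one practical reason for keeping the profile hashable.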

VulnerabilitySpec

A self-contained, JSON-serializable specification for a single stall site. This is the primary data artifact stored in PostgreSQL. It bundles the P-Code slice, constraint profile, and all metadata needed for the Orchestrator to generate a Z3 script without re-running the Extractor.
Produced by: extractor/spec_exporter.py
Consumed by: database/spec_store.py, orchestrator/retrieval_carm.py
spec_id
str
Unique 16-character identifier derived as the first 16 hex characters of SHA-256(binary_path + stall_address). Deterministic — the same binary and address always produce the same ID.
binary_path
Path
Absolute path to the analyzed binary.
stall_address
str
The stall site address (hex string).
function_name
str
Containing function name from Ghidra.
pcode_slice
PCodeSlice
The full extracted P-Code backward slice.
constraint_profile
ConstraintProfile
The structural constraint fingerprint computed from the slice.
architecture
Architecture
Target CPU architecture.
z3_template_hint
str | None
A previously successful Z3 script for a structurally similar constraint profile, retrieved from CARM. Injected into the LLM prompt as a starting point.
correction_history
list[CorrectionEntry]
Ordered list of past error→correction pairs accumulated across all ReAct turns for this stall. Fed to the LLM as negative examples to prevent repeated mistakes.
created_at
datetime
UTC timestamp when this spec was first created by SpecExporter.
last_attempted
datetime | None
UTC timestamp of the most recent solve attempt, or None if never attempted.
solve_count
int
Number of times a payload has been successfully injected for this spec.
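
The `spec_id` derivation described above can be sketched in a few lines. The exact byte encoding and the way the path is stringified before hashing are not stated in this page, so plain string concatenation with UTF-8 encoding is an assumption here, and `derive_spec_id` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def derive_spec_id(binary_path: Path, stall_address: str) -> str:
    """First 16 hex characters of SHA-256(binary_path + stall_address)."""
    # Assumption: the path is stringified and concatenated with the address as UTF-8.
    material = f"{binary_path}{stall_address}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:16]


first = derive_spec_id(Path("/targets/modbus_fw.elf"), "0x00401a20")
second = derive_spec_id(Path("/targets/modbus_fw.elf"), "0x00401a20")
other = derive_spec_id(Path("/targets/modbus_fw.elf"), "0x00401a24")
```

The same inputs always yield the same ID, so re-running the Extractor on an unchanged binary and address maps to the existing record rather than creating a new one. The path `/targets/modbus_fw.elf` is made up for the example.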

CorrectionEntry

A single error→correction pair from a past Z3 generation attempt. Stored in VulnerabilitySpec.correction_history to give the LLM a record of prior failures (negative examples) when generating repair prompts. CorrectionEntry is frozen.
error_message
str
The error string returned by the Z3 sandbox or the AgentLoop’s incomplete-model checker.
corrected_script
str
The full Z3Py script that was submitted when this error occurred (not necessarily a corrected version — it is the script that produced the error).
timestamp
datetime
UTC timestamp when this correction entry was created.
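
As a rough illustration of how a history of such entries might grow, the sketch below appends one entry per failed turn. `record_failure` is a hypothetical helper, not a function from the codebase, and returning a new list instead of mutating in place is a design assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CorrectionEntry:
    error_message: str
    corrected_script: str   # the script that produced the error
    timestamp: datetime


def record_failure(
    history: list[CorrectionEntry], error_message: str, script_text: str
) -> list[CorrectionEntry]:
    """Return a new history with one more entry, stamped in UTC."""
    entry = CorrectionEntry(error_message, script_text, datetime.now(timezone.utc))
    return [*history, entry]


history: list[CorrectionEntry] = []
history = record_failure(history, "NameError: name 'crc' is not defined", "s.add(crc == 0)")
history = record_failure(history, "model is missing byte_3", "s.add(byte_0 == 1)")
```

The error strings and script fragments are invented placeholders; real entries carry whatever the sandbox or the incomplete-model checker reported.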

StallReport

A report from the stall detector indicating a coverage plateau at a specific address. Placed on the AgentLoop’s priority queue for processing.
Produced by: fuzzer_bridge/stall_detector.py
Consumed by: orchestrator/agent_loop.py
stall_address
str
The address where coverage stalled (hex string).
binary_path
Path
Path to the binary being fuzzed.
severity
StallSeverity
Priority classification: CRITICAL, HIGH, MEDIUM, or LOW. Determines queue ordering — CRITICAL stalls are dequeued first.
cycles_stalled
int
Number of AFL++ cycles that have passed with no new edges at this address.
seed_input
bytes
Raw bytes of the AFL++ queue entry that most recently reached this address. Used as the concrete input context for the LLM prompt.
seed_input_path
Path
Filesystem path to the seed file in AFL++’s queue/ directory.
coverage_bitmap
bytes | None
Snapshot of AFL++’s coverage bitmap for diffing. None if not captured.
detected_at
datetime
UTC timestamp when the stall was first detected.
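
The severity-based ordering can be sketched with a heap. This is only one way to get "CRITICAL first" behavior: the numeric values assigned to `StallSeverity`, the first-in-first-out tie-break, and the `enqueue`/`dequeue` helpers are all assumptions, and the real AgentLoop queue may be built differently.

```python
import heapq
from enum import IntEnum
from itertools import count


class StallSeverity(IntEnum):
    # Assumption: lower value means higher priority.
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


_sequence = count()  # tie-breaker: equal severities leave in arrival order
queue: list[tuple[int, int, str]] = []


def enqueue(severity: StallSeverity, stall_address: str) -> None:
    heapq.heappush(queue, (severity, next(_sequence), stall_address))


def dequeue() -> str:
    _, _, stall_address = heapq.heappop(queue)
    return stall_address


enqueue(StallSeverity.LOW, "0x00401a20")
enqueue(StallSeverity.CRITICAL, "0x00401b00")
enqueue(StallSeverity.HIGH, "0x00401c40")
enqueue(StallSeverity.CRITICAL, "0x00401d10")

order = [dequeue() for _ in range(4)]
```

Both CRITICAL stalls come out before the HIGH and LOW ones, and the sequence counter keeps the two CRITICAL entries in the order they arrived.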

Z3GenerationRequest

A request to the LLM to generate Z3 scripts for a specific stall. Bundles all context the LLM needs: the P-Code slice (via VulnerabilitySpec), seed input, retrieved templates, correction history, and GDB runtime state.
Produced by: orchestrator/agent_loop.py
Consumed by: orchestrator/llm_client.py
vuln_spec
VulnerabilitySpec
The full vulnerability specification for the stall site, including P-Code slice and constraint profile.
seed_input
bytes
The closest seed input from the AFL++ queue. Provides concrete byte context and determines the input length used for byte-variable counting.
retrieved_templates
list[str]
Previously successful Z3 scripts retrieved from CARM for structurally similar stalls. Injected into the LLM prompt as positive examples.
correction_history
list[CorrectionEntry]
Error→correction pairs from prior ReAct turns. Grows by one entry per failed turn.
k_vote_count
int
Number of parallel Z3 script candidates to generate (K-way voting). Defaults to settings.k_vote_count.
base_offset
int
File byte offset where the function’s input pointer begins. Discovered by the REDQUEEN-style offset probe. When > 0, the LLM is told that input[0] maps to byte_{base_offset} in the full file.
runtime_state
dict[str, str]
GDB-captured memory/register values at function entry, keyed by names such as "rdi_ptr", "rdi_hex", "rsi_value". Used by the LLM to determine concrete struct field values invisible to static analysis.
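
The `base_offset` mapping is simple index arithmetic: position `i` in the function's input corresponds to `byte_{base_offset + i}` in the full file. The helper below is a hypothetical illustration of that rule, not code from the orchestrator.

```python
def file_byte_name(input_index: int, base_offset: int = 0) -> str:
    """Name of the file-level byte variable for input[input_index]."""
    if input_index < 0 or base_offset < 0:
        raise ValueError("indices and offsets must be non-negative")
    return f"byte_{base_offset + input_index}"


# With a 12-byte file header ahead of the function's input pointer:
first_name = file_byte_name(0, base_offset=12)
fourth_name = file_byte_name(3, base_offset=12)
```

With `base_offset` left at 0, function-relative and file-relative indices coincide, which matches the note above that the translation only matters when the offset is greater than 0.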

Z3Script

A Z3Py script generated by the LLM. One Z3Script is produced per voting candidate per ReAct turn.
Produced by: orchestrator/llm_client.py
Consumed by: orchestrator/z3_sandbox.py
script_text
str
The full Z3Py Python code, sanitized and ready for sandbox execution. The sandbox strips duplicate from z3 import *, s = Solver(), and s.check() calls before wrapping.
generation_idx
int
Which of the K voting candidates this script is (zero-indexed, 0 to k_vote_count - 1).
attempt_number
int
Which ReAct turn produced this script (one-indexed). 1 for the initial generation; higher for repairs.
prompt_tokens
int
Token count of the prompt submitted to the LLM. Used for API cost tracking.
completion_tokens
int
Token count of the LLM’s response. Used for API cost tracking.
model_name
str
The LLM model identifier that produced this script (e.g., "gpt-4.1", "gemini-2.0-flash").
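
The stripping step mentioned under `script_text` could look roughly like the sketch below. The regular expressions are assumptions about what counts as a duplicate line, and `strip_boilerplate` is a hypothetical name; the real sandbox may match more forms than these.

```python
import re

# Assumption: only whole lines consisting of exactly these statements are dropped.
_STRIP_PATTERNS = [
    re.compile(r"^\s*from\s+z3\s+import\s+\*\s*$"),
    re.compile(r"^\s*s\s*=\s*Solver\(\)\s*$"),
    re.compile(r"^\s*s\.check\(\)\s*$"),
]


def strip_boilerplate(script_text: str) -> str:
    """Remove lines the sandbox wrapper is expected to supply itself."""
    kept = [
        line
        for line in script_text.splitlines()
        if not any(pattern.match(line) for pattern in _STRIP_PATTERNS)
    ]
    return "\n".join(kept)


raw = (
    "from z3 import *\n"
    "s = Solver()\n"
    "byte_0 = BitVec('byte_0', 8)\n"
    "s.add(byte_0 == 0x41)\n"
    "s.check()\n"
)
cleaned = strip_boilerplate(raw)
```

This sketch deliberately leaves lines such as `if s.check() == sat:` alone, since removing them would change control flow. How the actual sandbox treats those cases is not specified here.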

Z3Result

The result of executing a Z3Script in the sandbox. Carries the solver verdict, the concrete variable model (if satisfiable), error details (if not), and timing.
Produced by: orchestrator/z3_sandbox.py
Consumed by: orchestrator/agent_loop.py
verdict
Z3Verdict
One of SAT, UNSAT, TIMEOUT, SYNTAX_ERROR, RUNTIME_ERROR, or UNKNOWN. See constants.Z3Verdict for semantics.
model
dict[str, int] | None
When verdict == Z3Verdict.SAT, a dict mapping Z3 variable names to concrete integer values. Variable names follow the byte_N convention so AgentLoop._model_to_payload() can reconstruct the input buffer. None for all non-SAT verdicts.
error_message
str | None
Error string when verdict is not SAT. This string is fed back to the LLM in repair prompts. None when verdict is SAT.
execution_time
float
Wall-clock seconds the sandbox subprocess ran (including subprocess startup overhead).
script
Z3Script
The Z3Script that produced this result. Used for pairing errors with the scripts that caused them during the repair selection step.
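
The relationship between `verdict`, `model`, and `error_message` amounts to a small invariant: a model exists only for SAT, and an error string only for non-SAT verdicts. The sketch below checks that invariant. The enum's string values and the `is_consistent` helper are assumptions, and whether a non-SAT result must always carry an error message is left open because the field descriptions do not say.

```python
from __future__ import annotations

from enum import Enum


class Z3Verdict(Enum):
    # Member names from the field description; the values are assumptions.
    SAT = "sat"
    UNSAT = "unsat"
    TIMEOUT = "timeout"
    SYNTAX_ERROR = "syntax_error"
    RUNTIME_ERROR = "runtime_error"
    UNKNOWN = "unknown"


def is_consistent(
    verdict: Z3Verdict, model: dict[str, int] | None, error_message: str | None
) -> bool:
    """True if the three fields agree with the contract described above."""
    if verdict is Z3Verdict.SAT:
        return model is not None and error_message is None
    # Non-SAT: there must be no model; an error message is allowed but not required.
    return model is None


ok_sat = is_consistent(Z3Verdict.SAT, {"byte_0": 0x41}, None)
ok_unsat = is_consistent(Z3Verdict.UNSAT, None, "constraints are unsatisfiable")
bad_sat = is_consistent(Z3Verdict.SAT, None, None)
bad_timeout = is_consistent(Z3Verdict.TIMEOUT, {"byte_0": 1}, "timed out")
```

A check like this is the kind of guard a producer could run before handing a result to the AgentLoop, so that downstream code can rely on `model` being present whenever the verdict is SAT.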

SolvedPayload

A concrete byte-array payload extracted from a SAT model and ready for injection into AFL++’s sync directory.
Produced by: orchestrator/agent_loop.py (via _model_to_payload())
Consumed by: fuzzer_bridge/payload_injector.py
raw_bytes
bytes
The payload bytes to write to the sync directory. Constructed by overlaying Z3-solved byte_N values onto the original seed input, preserving seed bytes at positions not covered by the model.
source_spec_id
str
The VulnerabilitySpec.spec_id that this payload solves. Used by the CARM retriever to update the winning template.
stall_address
str
The stall address this payload is designed to bypass (hex string).
z3_model
dict[str, int]
The raw Z3 model dict, preserved for audit logging and harvest-mode verification.
confidence
float
Score from 0.0 to 1.0 representing solve confidence. Currently 1.0 for all single-SAT accepts; future K-way agreement scoring will modulate this.
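
The overlay described for `raw_bytes` can be sketched as follows. This is an illustrative stand-in for `AgentLoop._model_to_payload()`, not its actual body: masking values to 8 bits, ignoring variables that do not match `byte_N`, and zero-padding when the model references an index past the end of the seed are all assumptions.

```python
def model_to_payload(seed: bytes, model: dict[str, int]) -> bytes:
    """Overlay solved byte_N values onto the seed, keeping all other seed bytes."""
    buffer = bytearray(seed)
    for name, value in model.items():
        suffix = name[len("byte_"):] if name.startswith("byte_") else ""
        if not suffix.isdecimal():
            continue  # assumption: helper variables in the model are ignored
        index = int(suffix)
        if index >= len(buffer):
            # Assumption: extend with zero bytes when the model outgrows the seed.
            buffer.extend(b"\x00" * (index + 1 - len(buffer)))
        buffer[index] = value & 0xFF  # assumption: keep only the low 8 bits
    return bytes(buffer)


payload = model_to_payload(b"AAAA", {"byte_1": 0x42, "byte_3": 0x143, "tmp": 7})
extended = model_to_payload(b"A", {"byte_2": 0x5A})
```

Positions 0 and 2 keep their seed bytes in the first call because the model says nothing about them, which is the behavior the `raw_bytes` description calls out.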
