Documentation Index
Fetch the complete documentation index at: https://mintlify.com/AdithyaaSivamal/Agentic-AFL/llms.txt
Use this file to discover all available pages before exploring further.
ConstraintTag is the structural constraint ontology at the heart of CARM retrieval. Each tag represents a structural pattern observed in a P-Code backward slice — not a specific algorithm. This algorithm-agnostic design means a proprietary checksum and a standard CRC-16 produce the same structural fingerprint, enabling Jaccard-based retrieval across diverse and unknown codebases. The tag set flows through the entire pipeline: constraint_profiler.py produces tags from P-Code analysis, retrieval_carm.py consumes them for Jaccard similarity matching, and llm_client.py reads them to generate targeted strategy hints for Z3 script generation.
Import
Tag Reference
The 16 tags are grouped into five structural categories. Each tag maps to anauto() integer value in the ConstraintTag enum — these integer values are stored as INTEGER[] in PostgreSQL for server-side Jaccard computation.
Comparison Patterns
Matches
reg == CONST equality checks. Indicates magic number comparisons, header field validation, and protocol version gates. This is the most common tag and appears whenever a branch depends on an input value matching a specific constant.Matches
reg < CONST or similar bounds checks. Indicates value range validation, length limit enforcement, and array bound guards. Commonly co-occurs with LENGTH_GATED_ACCESS.Matches dense if/else chains or
BRANCHIND P-Code ops (compiled switch/dispatch tables). Signals that the stall site fans out to many branches, each requiring a distinct concrete value for coverage.Bitwise Structure
Matches
(input & MASK) == EXPECTED patterns — flag checks, protocol field extraction, and status register tests. Signals that Z3 constraints should use BitVec AND operations rather than arithmetic equality.Detected when bitwise ops (
INT_AND, INT_OR, INT_XOR, INT_LEFT, INT_RIGHT, INT_SRIGHT) dominate the body of a loop. Characteristic of CRC computation, hash scrambling, and stream cipher operations. Co-occurs with COUNTED_LOOP or INPUT_DEPENDENT_LOOP.Matches shift-then-mask sequences used to extract sub-fields from a packed word (e.g., extracting a 4-bit opcode from a 32-bit instruction word). Indicates that the Z3 model must treat the input as a structured bit vector.
Arithmetic Structure
Matches modulo or remainder operations (
INT_REM, INT_SREM). Signals checksum validation, hash table indexing, or cyclic counter logic that wraps around a modulus.Indicates that the branch condition depends on integer overflow behavior — either testing carry/overflow flags or relying on unsigned wrap-around semantics. Z3’s
BitVec arithmetic handles this correctly, but the LLM must avoid signed integer models.Matches linear arithmetic constraints of the form
ax + by + c == 0. Enables Z3’s linear arithmetic solver (LIA) as an alternative to pure BitVec, which can be significantly faster for this class.Memory / Data Flow
Matches array or table access indexed by an input-derived value — lookup table CRCs, AES S-boxes, and Huffman decode tables. Signals that fuzzer mutations targeting this stall should enumerate the valid index domain.
Indicates memory access bounded by a length field loaded from the input buffer. The Z3 template must model both the length field constraint and the downstream access, linking them as dependent variables.
Detected when P-Code shows a multi-hop pointer dereference chain:
LOAD → addr → LOAD. Indicates that the constraint depends on a value fetched through one or more pointer indirections, requiring symbolic pointer modeling or function inlining in Z3.Loop Structure
A loop with a constant iteration bound visible in the P-Code slice. Z3 can often unroll these directly. Frequently co-occurs with
BITWISE_LOOP for fixed-round hash functions.A loop whose bound is derived from input bytes (e.g., iterating
length times). Z3 cannot unroll these statically; the LLM must generate a Z3 ForAll or model the loop as a symbolic constraint on the trip count.Complexity Markers
Detected when
CALL or CALLIND P-Code ops appear in the backward slice. The constraint involves the return value or side effects of a callee function, requiring either function inlining or an abstract summary in the Z3 model.Multi-level if/else nesting where the outer branch condition gates access to the inner branch. Signals that the Z3 template must conjoin multiple constraints, and that the LLM should generate a layered
And(...) formula rather than a flat equality check.Jaccard Similarity
CARM retrieval uses Jaccard similarity to measure structural overlap between a new stall site’s tag set and all previously storedVulnerabilitySpecs. Two stall sites are considered structurally similar if:
where A and B are sets of ConstraintTag integer values.
J(A, B) ≥ carm_similarity_threshold (default 0.3) are considered similar enough to share Z3 templates. The computation happens entirely server-side in PostgreSQL via the jaccard_similarity(a INTEGER[], b INTEGER[]) function — Python receives only the pre-ranked top-N rows.
Tag Detection
ConstraintProfiler assigns tags deterministically from P-Code mnemonic patterns observed in the backward slice from the stall site. Detection is stateless and reproducible — the same binary at the same address always produces the same tag set.
Key detection rules from the source:
BITWISE_LOOPis set when P-Code ops from theBITWISE_OPSset —INT_AND,INT_OR,INT_XOR,INT_LEFT,INT_RIGHT,INT_SRIGHT— appear inside a loop body within the slice.CALLEE_DEPENDENCYis set whenCALLorCALLINDP-Code ops are present anywhere in the slice, regardless of loop context.BITMASK_CHECKis set when anINT_ANDfeeds directly into aCBRANCHcomparison, indicating a flag-mask pattern rather than a general bitwise operation.
INTEGER[] in PostgreSQL (each integer is the ConstraintTag.value from the Python enum) to enable server-side set arithmetic via the intarray extension.
The
ConstraintProfiler is designed to over-tag rather than under-tag. A false positive (an extra tag that doesn’t affect the Z3 strategy) is less harmful than a false negative (missing a tag that would have matched a useful template).Other Enums
constants.py also defines the supporting enumerations used throughout the pipeline.
Architecture
Determines BitVec widths in generated Z3 scripts. The ARCH_REGISTER_WIDTH dict maps each value to its natural register width.
| Value | Description |
|---|---|
ARM32 = "arm32" | 32-bit ARM — Cortex-M, Cortex-R microcontrollers |
ARM64 = "arm64" | 64-bit ARM — Cortex-A application processors |
X86 = "x86" | 32-bit x86 |
X86_64 = "x86_64" | 64-bit x86 |
MIPS32 = "mips32" | 32-bit MIPS — common in routers and IoT devices |
PPC32 = "ppc32" | 32-bit PowerPC — some PLCs and embedded control systems |
Z3Verdict
The outcome of executing a Z3Py script in the sandbox.
Satisfiable — Z3 found a concrete model. The
model dict is populated with variable assignments that can be used to generate seed inputs.Unsatisfiable — no solution exists for the constraints as written. This may indicate over-constrained input (legitimate UNSAT) or an incorrect Z3 model (LLM modeling error).
s.check() exceeded z3_timeout_seconds (default 5). Typically caused by non-linear arithmetic, quantifiers, or very deep nested constraints. The self-repair loop will attempt to simplify the model.The Z3Py script failed to parse or compile. The raw Python exception is captured and fed back into the LLM self-repair cycle (LLM-Sym §3.2).
The script executed but raised a Python exception at runtime (e.g.,
AttributeError, TypeError). Triggers the self-repair cycle with the traceback as context.Z3 returned
unknown — typically caused by non-linear arithmetic that exceeds Z3’s decision procedures. The LLM may retry with a linearized approximation.StallSeverity
Priority classification for AFL++ stall sites. The Orchestrator processes stalls in descending severity order.
Coverage is completely blocked — zero new edges observed after N fuzzing cycles. Processed first. Justifies the full CARM + Z3 + self-repair pipeline regardless of compute cost.
Very few new edges discovered, with high cyclomatic complexity at the stall site. High probability that the branch requires a structured input value unreachable by random mutation.
Some coverage is possible via the stall site but the specific branch is consistently missed across many fuzzing cycles. Worth solving but not at the expense of
CRITICAL stalls.A stall exists but alternative code paths provide adequate coverage. Solved opportunistically when higher-priority stalls are exhausted.
System Defaults
These constants are defined inconstants.py to avoid circular imports with config.py. They serve as fallback values when environment variables are not set.
| Constant | Value | Reference |
|---|---|---|
DEFAULT_MAX_REPAIR_ATTEMPTS | 3 | LLM-Sym §3.2 — self-repair cycle attempt cap |
DEFAULT_K_WAY_VOTE_COUNT | 3 | LINC §2 — generate K scripts, pick the SAT result |
DEFAULT_Z3_TIMEOUT_SECONDS | 5 | TDD_v2 §4.3 — strict s.check() timeout |
DEFAULT_MAX_REACT_TURNS | 5 | SAILOR §4 — iterative refinement turn cap |
SAILOR’s original paper uses 60 ReAct turns. Agentic-AFL caps at 5 (
DEFAULT_MAX_REACT_TURNS = 5) to remain compatible with real-time fuzzing cadence, where the agent must resolve a stall and return control to AFL++ within seconds, not minutes.