Agentic-AFL Settings and Configuration Parameters Guide

The config.py module is the single source of truth for every tunable parameter in Agentic-AFL. It exposes one object, the settings singleton, which is an instance of the AgenticAFLConfig dataclass that components import instead of hardcoding values. When Agentic-AFL starts, config.py first attempts to load a .env file from the project root via python-dotenv. Keys found there are copied into os.environ, unless the shell has already set them, before individual fields are evaluated. The resulting precedence has three layers: an explicit shell export always wins over a .env entry, and a .env entry wins over the hardcoded defaults defined in constants.py. You can therefore tune a production deployment purely through environment variables, with no code changes.

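# Import the shared settings singleton instead of hardcoding values.
from agentic_afl.config import settings

# Each parameter is a plain attribute on the AgenticAFLConfig instance.
timeout = settings.z3_timeout_seconds
provider = settings.llm_api_provider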
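
The three-layer precedence described above (shell export, then .env, then hardcoded default) can be sketched with the standard library alone. This is a minimal illustration, not the actual config.py: the resolve_setting helper and the sample values are hypothetical, and the .env contents are passed in as a plain mapping rather than loaded through python-dotenv.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    default: str,
    dotenv_values: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the effective value for `key`: shell > .env > default."""
    env = os.environ if environ is None else environ
    if key in env:            # 1. an explicit shell export wins
        return env[key]
    if key in dotenv_values:  # 2. otherwise fall back to the .env entry
        return dotenv_values[key]
    return default            # 3. finally use the hardcoded default


# Hypothetical inputs, for illustration only.
dotenv = {"Z3_TIMEOUT_SECONDS": "10", "K_VOTE_COUNT": "5"}
shell = {"Z3_TIMEOUT_SECONDS": "30"}

print(resolve_setting("Z3_TIMEOUT_SECONDS", "5", dotenv, shell))  # 30 (shell)
print(resolve_setting("K_VOTE_COUNT", "3", dotenv, shell))        # 5 (.env)
print(resolve_setting("LOG_LEVEL", "INFO", dotenv, shell))        # INFO (default)
```

Passing the environment in as an argument keeps the lookup easy to test without mutating os.environ.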

Ghidra / Extractor Settings

These settings control where Ghidra is installed, where it stores project artifacts, and how aggressively the P-Code extractor processes each function slice before handing context off to the LLM.

ghidra_install_dir

Path

default:"/opt/ghidra"

Path to the Ghidra installation directory. The headless analyzer binary (analyzeHeadless) must exist under this path. Overridden by the GHIDRA_INSTALL_DIR environment variable.

ghidra_project_dir

Path

default:"/tmp/ghidra_projects"

Directory where Ghidra stores headless analysis projects and intermediate artifacts. Overridden by the GHIDRA_PROJECT_DIR environment variable.

max_slice_depth

int

default:"20"

Maximum number of basic blocks to traverse during backward slicing from a stall site. Deeper slices give the LLM richer constraint context but increase token cost and extraction time. A value of 20 balances coverage depth with latency.
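
To make the effect of max_slice_depth concrete, here is a rough sketch of a bounded backward traversal. It assumes the control-flow graph is available as a mapping from each basic block to its predecessors; the backward_slice function and the breadth-first order are illustrative assumptions, not the extractor's actual algorithm.

```python
from collections import deque
from typing import Dict, List


def backward_slice(
    predecessors: Dict[str, List[str]],
    stall_block: str,
    max_blocks: int = 20,
) -> List[str]:
    """Collect at most `max_blocks` basic blocks, walking backward from the stall site."""
    if max_blocks < 1:
        return []
    visited = [stall_block]
    seen = {stall_block}
    queue = deque([stall_block])
    while queue and len(visited) < max_blocks:
        block = queue.popleft()
        for pred in predecessors.get(block, []):
            if pred in seen:
                continue  # already part of the slice
            seen.add(pred)
            visited.append(pred)
            queue.append(pred)
            if len(visited) >= max_blocks:
                break  # budget exhausted, stop expanding
    return visited


# Hypothetical diamond-shaped CFG: A -> B, A -> C, B -> D, C -> D.
cfg_predecessors = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}
print(backward_slice(cfg_predecessors, "D"))                # ['D', 'B', 'C', 'A']
print(backward_slice(cfg_predecessors, "D", max_blocks=2))  # ['D', 'B']
```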

max_pcode_instructions

int

default:"200"

Maximum number of P-Code instructions retained before the extractor truncates the slice. Tuned to stay within approximately 4K tokens after prompt framing overhead is added. Raise this value when working with architectures that produce verbose P-Code (for example, MIPS32 with delay slots).

Target Architecture

target_architecture

Architecture

default:"Architecture.ARM32"

CPU architecture of the target binary. This value determines BitVec widths in generated Z3 scripts — a 32-bit target uses BitVec(32) registers, a 64-bit target uses BitVec(64). Valid values are ARM32, ARM64, X86, X86_64, MIPS32, and PPC32, all defined in the Architecture enum in constants.py.

The register widths each architecture maps to are fixed in ARCH_REGISTER_WIDTH inside constants.py:

Architecture	Register width
`ARM32`	32 bits
`ARM64`	64 bits
`X86`	32 bits
`X86_64`	64 bits
`MIPS32`	32 bits
`PPC32`	32 bits
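
The table above maps directly onto a lookup keyed by the Architecture enum. The sketch below mirrors the names Architecture and ARCH_REGISTER_WIDTH from the text, but the enum member values and the bitvec_declaration helper are assumptions made for illustration; the real definitions live in constants.py. The generated string uses Z3's Python BitVec(name, width) constructor.

```python
from enum import Enum


class Architecture(Enum):
    # Member values are assumed; only the member names come from the guide.
    ARM32 = "ARM32"
    ARM64 = "ARM64"
    X86 = "X86"
    X86_64 = "X86_64"
    MIPS32 = "MIPS32"
    PPC32 = "PPC32"


# Register width in bits per architecture, as listed in the table.
ARCH_REGISTER_WIDTH = {
    Architecture.ARM32: 32,
    Architecture.ARM64: 64,
    Architecture.X86: 32,
    Architecture.X86_64: 64,
    Architecture.MIPS32: 32,
    Architecture.PPC32: 32,
}


def bitvec_declaration(register: str, arch: Architecture) -> str:
    """Hypothetical helper: emit a Z3 declaration line sized for `arch`."""
    width = ARCH_REGISTER_WIDTH[arch]
    return f"{register} = BitVec('{register}', {width})"


print(bitvec_declaration("r0", Architecture.ARM32))   # r0 = BitVec('r0', 32)
print(bitvec_declaration("x0", Architecture.ARM64))   # x0 = BitVec('x0', 64)
```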

LLM / Orchestrator Settings

These settings control which LLM provider and model Agentic-AFL calls, how generated Z3 scripts are sampled and voted on, and how many self-repair or ReAct turns are permitted before the orchestrator defers back to AFL++.

llm_api_provider

str

default:"openai"

LLM backend to use for constraint solving and orchestration. Accepted values are "openai", "gemini", and "local". Overridden by the LLM_API_PROVIDER environment variable.

llm_api_key

str

required

OpenAI API key used when llm_api_provider is "openai". Must be set via the LLM_API_KEY environment variable or a .env file — never hardcoded. An empty string disables OpenAI calls without raising an error at startup.

gemini_api_key

str

Google Gemini API key, consulted when llm_api_provider is "gemini". Kept separate from llm_api_key so users running hybrid experiments can have both keys loaded simultaneously. Overridden by the GEMINI_API_KEY environment variable.

llm_model_name

str

default:"gpt-4.1-2025-04-14"

Exact model identifier passed to the LLM provider’s API. Switch this to gemini-2.0-flash-exp when using the Gemini backend. Overridden by the LLM_MODEL_NAME environment variable.

llm_temperature

float

default:"0.7"

Sampling temperature for Z3 script generation. A moderate value of 0.7 gives enough creative variation for K-way voting to find a SAT candidate without producing incoherent constraint logic. Lowering this toward 0.0 makes outputs more deterministic.

llm_max_output_tokens

int

default:"16384"

Maximum output tokens per LLM call. Gemini thinking-mode models count chain-of-thought tokens against this budget, so the default is set high enough to accommodate both the reasoning trace and the final Z3 script.

k_vote_count

int

default:"3"

Number of Z3 scripts generated in parallel per stall event for K-way voting (LINC §2). The orchestrator picks the first script that returns SAT; if none do, self-repair begins. K=3 is cheap — three parallel API calls — and empirically mitigates the 13–38% syntax error rate observed in LLM-generated constraint code. Overridden by the K_VOTE_COUNT environment variable.
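
The K-way voting step can be sketched as follows. The generate and check callables stand in for the LLM call and the Z3 sandbox and are purely hypothetical; the sketch also assumes that "first" means first by candidate index rather than first to finish.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def k_way_vote(
    generate: Callable[[int], str],
    check: Callable[[str], str],
    k: int = 3,
) -> Optional[str]:
    """Generate k candidate scripts in parallel and return the first that is SAT."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        # pool.map preserves input order, so candidates stay indexed 0..k-1.
        candidates = list(pool.map(generate, range(k)))
    for script in candidates:
        if check(script) == "sat":
            return script
    return None  # no SAT candidate: the caller would start self-repair


# Stand-in callables, for illustration only.
def fake_generate(index: int) -> str:
    return f"script-{index}"


def fake_check(script: str) -> str:
    return "sat" if script in {"script-1", "script-2"} else "unsat"


print(k_way_vote(fake_generate, fake_check))  # script-1
```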

max_repair_attempts

int

default:"3"

Maximum number of self-repair cycles the LLM is allowed per Z3 generation attempt (LLM-Sym §3.2). On each cycle, the Z3 sandbox error message is fed back to the LLM with a repair prompt. If the script is still broken after this many attempts, the stall event is skipped and AFL++ resumes unassisted.
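
The self-repair cycle described above amounts to a bounded retry loop. In this sketch, run_in_sandbox and repair are hypothetical callables standing in for the Z3 sandbox and the LLM repair prompt; returning None represents skipping the stall event.

```python
from typing import Callable, Optional, Tuple


def solve_with_repair(
    script: str,
    run_in_sandbox: Callable[[str], Tuple[bool, str]],
    repair: Callable[[str, str], str],
    max_repair_attempts: int = 3,
) -> Optional[str]:
    """Run `script`; on failure, feed the error back for up to N repair cycles."""
    ok, error = run_in_sandbox(script)
    attempts = 0
    while not ok and attempts < max_repair_attempts:
        script = repair(script, error)  # the error message drives the repair prompt
        attempts += 1
        ok, error = run_in_sandbox(script)
    return script if ok else None  # None: skip the stall, AFL++ resumes unassisted


# Stand-ins: the "sandbox" accepts a script once it carries two repair marks.
def fake_sandbox(script: str) -> Tuple[bool, str]:
    return (script.count("!") >= 2, "SyntaxError: invalid syntax")


def fake_repair(script: str, error: str) -> str:
    return script + "!"


print(solve_with_repair("x", fake_sandbox, fake_repair))  # x!!
```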

max_react_turns

int

default:"5"

Maximum number of ReAct (Reason + Act) turns the orchestrator executes before abandoning the current stall and deferring back to AFL++ (SAILOR §4). SAILOR’s original implementation allows up to 60 turns; 5 is calibrated for real-time fuzzing campaigns where latency matters more than exhaustive search.

Z3 Sandbox

z3_timeout_seconds

int

default:"5"

Hard timeout for each s.check() call inside the Z3 subprocess sandbox (TDD_v2 §4.3). It bounds solver time on cryptographic constraint systems: without this limit, a single check() on a SHA-256 constraint can run indefinitely. Overridden by the Z3_TIMEOUT_SECONDS environment variable.

z3_sandbox_dir

Path

default:"/tmp/agentic_afl_sandbox"

Temporary directory used by the Z3 sandbox for subprocess script files and result artifacts. Each invocation writes a uniquely named script here and cleans it up after execution. Overridden by the Z3_SANDBOX_DIR environment variable.
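
A rough sketch of the sandbox lifecycle follows: write a uniquely named script, run it in a subprocess under a time limit, then clean up. Note the assumption: this version enforces a wall-clock limit on the whole subprocess, whereas the setting above is described as a per-s.check() timeout, so treat it as an outer safety net rather than the actual mechanism. The run_script_sandboxed name is hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script_sandboxed(
    source: str,
    sandbox_dir: Path,
    timeout_seconds: float = 5,
) -> str:
    """Run `source` in a child Python process; return its stdout or 'timeout'."""
    sandbox_dir.mkdir(parents=True, exist_ok=True)
    # NamedTemporaryFile picks a unique file name for each invocation.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", dir=sandbox_dir, delete=False
    ) as handle:
        handle.write(source)
        script_path = Path(handle.name)
    try:
        result = subprocess.run(
            [sys.executable, str(script_path)],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
        return result.stdout.strip()
    except subprocess.TimeoutExpired:
        return "timeout"
    finally:
        script_path.unlink(missing_ok=True)  # clean up after execution
```

For example, run_script_sandboxed("print('sat')", Path("/tmp/agentic_afl_sandbox")) would return the child's printed result, while a script that exceeds the limit yields "timeout".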

AFL++ / Fuzzer Bridge

These settings wire Agentic-AFL to a running AFL++ instance. AFL++ must already be started pointing at the same afl_output_dir; Agentic-AFL watches it for stalls and injects solved payloads back into afl_sync_dir so they are ingested on the next AFL++ cycle.

afl_output_dir

Path

default:"./afl_output"

Root output directory for the AFL++ campaign. Agentic-AFL reads fuzzer_stats and queue files from this directory to detect stall conditions. Overridden by the AFL_OUTPUT_DIR environment variable.

afl_sync_dir

Path

default:"./afl_output/sync_dir"

Directory into which Agentic-AFL drops solved constraint payloads. AFL++ natively ingests any new file placed here on its next cycle. This must be the same sync_dir AFL++ was launched with. Overridden by the AFL_SYNC_DIR environment variable.

min_stall_cycles

int

default:"50"

Minimum number of AFL++ cycles that must elapse without new edge coverage before a stall is declared and the orchestrator is invoked. Setting this too low causes unnecessary LLM calls on healthy fuzz runs; too high and genuine roadblocks are left unaddressed for too long.

stall_poll_interval

float

default:"5.0"

How often (in seconds) the stall detector reads fuzzer_stats to check coverage progress. Lowering this increases monitoring responsiveness at the cost of slightly more I/O against the AFL++ output directory.
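
Each poll reads fuzzer_stats, which AFL++ writes as "key : value" lines. A minimal parser might look like the sketch below; parse_fuzzer_stats is a hypothetical helper, and the field names in the sample follow AFL++'s fuzzer_stats file but should be checked against your AFL++ version.

```python
from typing import Dict


def parse_fuzzer_stats(text: str) -> Dict[str, str]:
    """Parse 'key : value' lines into a dict, splitting on the first colon only."""
    stats: Dict[str, str] = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:  # skip blank or malformed lines
            stats[key.strip()] = value.strip()
    return stats


# Abbreviated sample contents, for illustration only.
sample = """\
cycles_done       : 12
cycles_wo_finds   : 3
edges_found       : 1840
"""

stats = parse_fuzzer_stats(sample)
print(int(stats["cycles_wo_finds"]))  # 3
```

Splitting on the first colon only keeps values that themselves contain colons intact.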

min_stall_time_seconds

int

default:"0"

Time-based stall threshold in seconds. When set to a value greater than 0, this overrides cycle-based detection: a stall is declared only after this many seconds have elapsed with no new edges. Useful for long-running continuous campaigns where AFL++ cycles vary wildly in duration. A value of 0 means cycle-based detection (min_stall_cycles) is used instead. Overridden by the MIN_STALL_TIME_SECONDS environment variable.
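
The interaction between min_stall_cycles and min_stall_time_seconds can be summarised in a few lines. This is a sketch of the decision rule as described above, not the detector's actual code; the is_stalled name and the use of inclusive (>=) comparisons are assumptions.

```python
def is_stalled(
    cycles_without_new_edges: int,
    seconds_without_new_edges: float,
    min_stall_cycles: int = 50,
    min_stall_time_seconds: int = 0,
) -> bool:
    """Apply the time-based threshold when it is set, else the cycle-based one."""
    if min_stall_time_seconds > 0:
        # Time-based detection overrides cycle-based detection entirely.
        return seconds_without_new_edges >= min_stall_time_seconds
    return cycles_without_new_edges >= min_stall_cycles


print(is_stalled(50, 12.0))                               # True  (cycle-based)
print(is_stalled(100, 10.0, min_stall_time_seconds=60))   # False (time-based wins)
print(is_stalled(0, 60.0, min_stall_time_seconds=60))     # True
```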

PostgreSQL (Spec Store / CARM)

Agentic-AFL stores constraint templates in PostgreSQL with a JSONB schema and a custom jaccard_similarity() SQL function. PostgreSQL replaced an earlier mem0 vector-DB backend because CARM’s retrieval algorithm uses Jaccard similarity over tag sets, which is fundamentally different from the cosine similarity that vector databases are optimized for. Computing Jaccard server-side in SQL returns only the top-N qualified matches without pulling all rows into Python.

postgres_dsn

str

PostgreSQL connection string (DSN) for the spec store. Overridden by the POSTGRES_DSN environment variable. The default credentials match the development Docker setup described in the environment variables reference.

carm_similarity_threshold

float

default:"0.3"

Minimum Jaccard similarity score for the CARM retrieval step to consider a stored template a candidate match (ConstraintLLM §2.2). This value is passed directly to the SQL WHERE jaccard_similarity(...) >= ? clause. Lower values return more candidates with potentially noisier matches; higher values return fewer, more precise templates.

carm_max_results

int

default:"5"

Maximum number of constraint templates returned per CARM query, applied as a SQL LIMIT. Increase this when operating on large spec stores to give the LLM more template candidates; decrease it to reduce prompt token cost.
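
For intuition, the retrieval that the SQL performs server-side can be mirrored in Python. Jaccard similarity is the size of the intersection of two tag sets divided by the size of their union. The retrieve_templates helper, the sample tags, and the descending sort are illustrative assumptions; the real query runs inside PostgreSQL.

```python
from typing import Dict, List, Set, Tuple


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """|a ∩ b| / |a ∪ b|, defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def retrieve_templates(
    query_tags: Set[str],
    templates: Dict[str, Set[str]],
    threshold: float = 0.3,
    max_results: int = 5,
) -> List[Tuple[str, float]]:
    """Keep templates scoring >= threshold, best first, capped at max_results."""
    scored = [
        (name, jaccard_similarity(query_tags, tags))
        for name, tags in templates.items()
    ]
    matches = [item for item in scored if item[1] >= threshold]
    matches.sort(key=lambda item: item[1], reverse=True)
    return matches[:max_results]


# Hypothetical tag sets, for illustration only.
store = {
    "crc32": {"crc", "checksum"},
    "magic": {"magic", "header"},
    "loop_sum": {"loop", "checksum", "add", "xor"},
}
result = retrieve_templates({"crc", "checksum", "loop"}, store)
print([name for name, _ in result])  # ['crc32', 'loop_sum']
```

Here crc32 scores 2/3 and loop_sum scores 2/5, so both clear the 0.3 threshold, while magic scores 0 and is dropped.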

Logging / Debug

log_level

str

default:"INFO"

Python logging level for all Agentic-AFL components. Accepted values are DEBUG, INFO, WARNING, ERROR, and CRITICAL. Set to DEBUG to trace individual P-Code extraction steps and LLM prompts. Overridden by the LOG_LEVEL environment variable.

log_dir

Path

default:"./logs"

Directory where structured log files are written. The directory is created automatically if it does not exist. Overridden by the LOG_DIR environment variable.

debug_mode

bool

default:"false"

When enabled, Agentic-AFL saves raw LLM completions (full prompt + response) and every generated Z3 script to /tmp/agentic_afl_debug/ for post-mortem analysis. Debug mode significantly increases disk usage on high-throughput campaigns. Enable it by setting the DEBUG_MODE environment variable to 1, true, or yes.
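
Parsing a boolean flag such as DEBUG_MODE from the environment might look like the sketch below. The env_flag helper is hypothetical, and treating the accepted values case-insensitively is an assumption; the guide only lists 1, true, and yes.

```python
import os
from typing import Mapping, Optional

TRUTHY_VALUES = {"1", "true", "yes"}


def env_flag(
    name: str,
    environ: Optional[Mapping[str, str]] = None,
    default: bool = False,
) -> bool:
    """Return True when the variable is set to 1, true, or yes."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default  # unset: keep the documented default of false
    # Case-insensitive matching is an assumption of this sketch.
    return raw.strip().lower() in TRUTHY_VALUES


print(env_flag("DEBUG_MODE", {"DEBUG_MODE": "yes"}))  # True
print(env_flag("DEBUG_MODE", {"DEBUG_MODE": "0"}))    # False
print(env_flag("DEBUG_MODE", {}))                     # False
```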

Every setting backed by an environment variable can be overridden without touching a single line of code. Set the variable in your shell or .env file before starting Agentic-AFL; the new value takes effect on the next launch.
