Agentic-AFL uses PostgreSQL as its persistent store forDocumentation Index
Fetch the complete documentation index at: https://mintlify.com/AdithyaaSivamal/Agentic-AFL/llms.txt
Use this file to discover all available pages before exploring further.
VulnerabilitySpec objects and Z3 templates. The schema is optimized for Jaccard similarity queries on INTEGER[] constraint tag arrays using the intarray extension and a custom jaccard_similarity() SQL function. All similarity computation happens server-side — Python receives only the pre-ranked top-N results, never the full table.
Why PostgreSQL
The choice of PostgreSQL over a vector database (such as mem0, Qdrant, or Chroma) is a deliberate architectural decision documented inspec_store.py.
CARM retrieval requires Jaccard similarity on formal logic tag sets — a set-intersection-over-union operation on small integer arrays. This is fundamentally different from cosine similarity on high-dimensional embedding vectors, which is what vector databases are optimized for.
| Approach | Jaccard computation | Memory usage | Query complexity |
|---|---|---|---|
| mem0 / vector-DB | Pull ALL records to Python RAM, loop in Python | O(N) RAM | O(N) |
| PostgreSQL + intarray | jaccard_similarity() runs in the database engine | Constant in Python | O(log N) with GIN index |
pgvector extension can be added to the same PostgreSQL instance to enable a two-stage retrieval strategy (ConstraintLLM §2.2) — embedding pre-filter to narrow candidates, followed by Jaccard re-ranking — all in a single database query without introducing a separate service.
The
intarray extension is required for Jaccard similarity queries. It ships with standard PostgreSQL distributions. Run CREATE EXTENSION IF NOT EXISTS intarray; if it is not already enabled on your instance.Schema
The full initialization script is mounted into the PostgreSQL container viadocker-compose.yml and runs automatically on first startup. It is idempotent — safe to run against an existing database.
init.sql
jaccard_similarity(a INTEGER[], b INTEGER[])
The core function that justifies the PostgreSQL architecture. It implements ConstraintLLM Eq. 3 directly in PL/pgSQL:
UNNEST(a) INTERSECT UNNEST(b)— set intersection of the two tag arraysUNNEST(a) UNION UNNEST(b)— set union of the two tag arrays- Returns
0.0when both arrays are empty (avoids division by zero) - Declared
IMMUTABLE— PostgreSQL can inline and cache calls within a single query
Indexes
Three indexes are created to accelerate the common query patterns:| Index | Type | Purpose |
|---|---|---|
idx_specs_architecture | B-Tree | Equality filter on the architecture column; eliminates cross-arch template suggestions |
idx_specs_stall_address | B-Tree | Direct lookup by hex stall address for SpecStore.load_spec() |
idx_specs_tags | GIN | Fast set membership and containment queries on the constraint_tags INTEGER[] column; assists the jaccard_similarity() function |
Column Reference
SHA-256 hash (truncated to 16 hex characters) of
binary_path + stall_address. Uniquely identifies a stall site within a specific binary. Used by SpecStore.load_spec() and all update operations.Absolute filesystem path to the analyzed binary (e.g.,
/targets/firmware.elf). Preserved for provenance — the orchestrator can re-analyze the binary if the spec needs updating.Hex string address of the AFL++ stall site (e.g.,
"0x08001234"). Stored as text to preserve leading zeros and architecture-specific formatting. Indexed for direct lookup.Ghidra decompiled function name containing the stall site (e.g.,
"validate_header", "FUN_08001200"). Used by the LLM prompt to provide human-readable context when no decompiled name is available.Architecture enum value for the target binary (e.g., "arm32", "x86_64"). Used as a mandatory filter in CARM queries to prevent cross-architecture template suggestions — ARM32 templates use BitVec(32) and are incompatible with X86_64 targets.Sorted array of
ConstraintTag enum integer values derived from P-Code analysis of the stall site’s backward slice. This is the primary input to the server-side jaccard_similarity() function. Empty array '{}' means no structural tags were detected. The GIN index (idx_specs_tags) accelerates set operations on this column.Concatenated P-Code instruction text from the backward slice rooted at the stall site branch. Included verbatim in the LLM prompt so the model can reason about the exact operations performed on the input before the branch.
Full
ConstraintProfile metrics serialized as a JSON object. Contains the following fields populated by constraint_profiler.py:bitwise_density— Fraction of slice ops that are bitwise operations (float, 0.0–1.0)arithmetic_density— Fraction of slice ops that are arithmetic operations (float, 0.0–1.0)loop_depth— Maximum loop nesting depth observed in the slice (integer)register_count— Number of distinct registers read in the slice (integer)estimated_complexity— Heuristic complexity score (0–100) used for Stage 2 re-ranking inCARMRetriever._compute_relevance()
The last successful Z3Py script for this stall site — i.e., the script that produced a
SAT verdict and a valid concrete model. NULL for stall sites that have never been solved. Retrieved by CARM and passed to the LLM as a starting-point template for similar stall sites.JSON array of
{error_message, corrected_script, timestamp} objects accumulated across all past self-repair attempts. New entries are appended (not overwritten) via PostgreSQL’s || JSONB concatenation operator, preserving the full repair history. Used to provide the LLM with prior failure context during the self-repair cycle (SAILOR §4.4).Timestamp (with timezone) when this spec was first inserted. Set server-side to
NOW() on insert; not updated on subsequent upserts.Timestamp of the most recent agent attempt to solve this spec — updated on every upsert and on
update_template() and append_correction() calls. NULL for specs that have been stored but not yet attempted. Useful for prioritizing stall sites that have not been re-attempted recently.Number of times this spec has been successfully solved (SAT verdict with a valid concrete model). Incremented by
SpecStore.update_template(). Used as a quality signal in Stage 2 re-ranking — specs with solve_count > 0 are preferred as template sources because their Z3 scripts have been empirically validated.CARM Query Pattern
CARMRetriever calls SpecStore.query_by_jaccard(), which executes the following parameterized query. The jaccard_similarity() call and the ORDER BY happen entirely inside PostgreSQL.
| Parameter | Type | Description |
|---|---|---|
$1 | INTEGER[] | Query tag array for the current stall site (ConstraintTag values) |
$2 | TEXT | Architecture filter (e.g., "arm32") |
$3 | FLOAT | Minimum Jaccard threshold (default 0.3 from carm_similarity_threshold) |
$4 | INTEGER | Maximum rows to return (default 5 from carm_max_results) |
$4 rows, CARMRetriever._compute_relevance() applies a Python-side re-ranking combining Jaccard score (weight 0.6), complexity similarity (weight 0.2), and solve history (weight 0.2) to produce the final relevance_score ordering.
Deployment Options
Local Docker (development)
The recommended way to run the database locally. Theinit.sql file is mounted as a Docker entrypoint initialization script and runs automatically on first startup.
DATABASE_URL for config.py is:
Managed PostgreSQL (production)
Any PostgreSQL 14+ instance with theintarray extension available is supported. Tested providers include:
- Supabase —
intarrayavailable; connection pooling via PgBouncer recommended for the asyncpg pool - Neon — Serverless PostgreSQL;
intarrayavailable; branch-per-environment workflow fits the fuzzing campaign model - Self-hosted Proxmox LXC/VM — Provisioned via Terraform as documented in
infra/
SpecStore Initialization
SpecStore.initialize() is called once at agent startup (agent_loop.setup()). It creates the connection pool, registers the jaccard_similarity() function, and creates the table and indexes — all idempotently via CREATE ... IF NOT EXISTS.
min_size=2, max_size=10, suitable for a single-agent fuzzing process. Increase max_size for multi-agent parallel deployments.