Documentation Index
Fetch the complete documentation index at: https://mintlify.com/AdithyaaSivamal/Agentic-AFL/llms.txt
Use this file to discover all available pages before exploring further.
Z3Sandbox executes LLM-generated Z3Py scripts in an isolated subprocess with strict timeouts and resource limits. LLM-generated code is untrusted; the sandbox enforces both time and memory budgets and intercepts dangerous imports before any user code runs. All K voting candidates from LLMClient are executed in parallel via asyncio.gather, making the sandbox the primary throughput bottleneck of the pipeline.
Import
Security Model
Every script execution goes through the following protection layers in order:-
Subprocess isolation — Scripts run in a separate Python subprocess (
asyncio.create_subprocess_exec). A crash, infinite loop, orsys.exit()inside the script cannot affect the main Agentic-AFL process. -
RLIMIT_AS(512 MB memory cap) — Set viaresource.setrlimit(resource.RLIMIT_AS, ...)inside the subprocess before any user code executes. A Z3 path explosion that allocates more than 512 MB kills the subprocess withMemoryError. -
RLIMIT_CPU(CPU time limit) — The subprocess’s CPU time is bounded. This acts as a backstop if the asyncio timeout fires before the OS scheduler has a chance to kill the process. -
Import whitelist — A
_restricted_importhook is installed on__builtins__.__import__afterz3has finished loading (Z3’s C extension loadsos,ctypes, etc. internally at import time, so the hook is installed post-import to avoid breaking Z3 itself). Blocked modules includeos,subprocess,socket,http,urllib,requests,threading, andmultiprocessing. -
Strict solver timeout — The execution harness passes
s.set("timeout", <ms>)to the Z3Solverinstance before any LLM-generated constraints are added. This boundss.check()toz3_timeout_seconds(default 5 s).
Constructor
Z3Sandbox is constructed with no required arguments. All parameters have sensible defaults sourced from settings.
Wall-clock seconds before the subprocess is killed with
asyncio.TimeoutError. An additional 5-second overhead buffer is added for subprocess startup, so the actual asyncio.wait_for timeout is timeout_seconds + 5.Directory where temporary script files are written before execution. Each script is given a random UUID-based filename (
z3_exec_<uuid8>.py) and deleted after execution unless keep_scripts=True.When
True, temporary script files are not deleted after execution. Useful for post-mortem debugging — combine with settings.debug_mode which also writes raw LLM completions and sandbox results to /tmp/agentic_afl_debug/.async execute(script: Z3Script) -> Z3Result
Execute a Z3 script in an isolated subprocess. This is the only public method of Z3Sandbox.
Steps performed:
- Wrap the raw LLM script in the execution harness template (strip duplicate
from z3 import *,s = Solver(),s.check(), and model extraction boilerplate that the harness already provides) - Write the wrapped script to a temporary file in
sandbox_dir - Execute the file with
asyncio.create_subprocess_exec(sys.executable, script_path) - Wait up to
timeout_seconds + 5seconds for the subprocess to finish - Parse the JSON printed between
===Z3_RESULT_JSON===and===Z3_RESULT_END===delimiters - Construct and return a
Z3Result - Delete the temporary file (unless
keep_scripts=True)
SYNTAX_ERROR. If the subprocess exits without the delimiters but with exit code 0, the result is classified as RUNTIME_ERROR.
Z3Verdict values
| Verdict | Enum value | Meaning |
|---|---|---|
SAT | "sat" | Z3 found a satisfying assignment — model dict is populated |
UNSAT | "unsat" | Z3 proved no solution exists — constraints are structurally wrong |
TIMEOUT | "timeout" | s.check() exceeded timeout_seconds — constraints may need concretization |
SYNTAX_ERROR | "syntax_error" | Script failed to parse/compile before _execute() ran |
RUNTIME_ERROR | "runtime_error" | Script executed but raised a Python exception inside _execute() |
UNKNOWN | "unknown" | Z3 returned unknown (e.g., non-linear arithmetic beyond Z3’s decision procedures) |
Z3 Model Format
Whenverdict is SAT, the model field is a dict[str, int] mapping Z3 variable names to concrete integer values. The naming convention enforced by LLMClient’s system prompt ensures variables are named byte_0, byte_1, etc., one per input byte position.
AgentLoop._model_to_payload() reads this dict and overlays the solved values onto the original seed input. Positions not covered by the model retain their original seed bytes, allowing the solver to address only the constrained header while leaving the packet body intact.
Scripts that print nothing to stdout (due to early exit, uncaught exception before the
_execute() wrapper, or a bare sys.exit()) are classified as RUNTIME_ERROR. The error message in Z3Result.error_message is capped at 2000 characters and is fed back to the LLM as the correction prompt in the next ReAct turn.Parallel Execution
The sandbox is designed for parallel execution.AgentLoop runs all K Z3Sandbox.execute() calls concurrently with asyncio.gather, so total latency per ReAct turn is approximately the latency of the slowest of the K scripts rather than K times the average latency.
execute() call writes its own uniquely named temp file, so concurrent executions never collide on the filesystem. The subprocess isolation also means that a timeout or crash in one candidate does not affect the others.