The langgraph-react recipe is the canonical forkd “fork a thinking agent” demo. A ReAct agent runs a trip-planning task, builds up reasoning state across multiple steps — tool calls, conversation history, partial answers — and then gets BRANCHed while it’s still mid-thought. Three grandchildren are spawned from the branch, each injected with a different steering hint ("be thorough", "be minimal", "optimize for cost"). All three inherit the parent’s prior reasoning state identically; only the next LLM call diverges. The result: three independent itineraries that differ in ways the model was never explicitly told to produce.

Prerequisites

  • forkd installed and forkd doctor passing all checks
  • The langgraph parent snapshot built or pulled (see below)
  • An LLM API key — the demo defaults to SiliconFlow (OpenAI-compatible, hosts DeepSeek-V3 / Qwen). Set SILICONFLOW_API_KEY, or point LLM_BASE_URL + LLM_API_KEY at any OpenAI-compatible endpoint.
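The key lookup described in the last prerequisite can be sketched as follows. This is a minimal illustration, not the demo's actual logic: the precedence (an explicit LLM_BASE_URL + LLM_API_KEY pair wins over SILICONFLOW_API_KEY), the helper name `resolve_llm_config`, and the default base URL are all assumptions.

```python
import os
from typing import Mapping, Tuple

# Assumed default endpoint; confirm the real base URL with your provider.
ASSUMED_SILICONFLOW_BASE_URL = "https://api.siliconflow.cn/v1"


def resolve_llm_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (base_url, api_key) from the environment variables named above.

    Assumed precedence: an explicit LLM_BASE_URL + LLM_API_KEY pair wins,
    otherwise fall back to SILICONFLOW_API_KEY with the assumed default URL.
    """
    base_url = env.get("LLM_BASE_URL", "").strip()
    api_key = env.get("LLM_API_KEY", "").strip()
    if base_url and api_key:
        return base_url, api_key

    sf_key = env.get("SILICONFLOW_API_KEY", "").strip()
    if sf_key:
        return ASSUMED_SILICONFLOW_BASE_URL, sf_key

    raise RuntimeError(
        "Set SILICONFLOW_API_KEY, or both LLM_BASE_URL and LLM_API_KEY."
    )
```

Running a check like this before the build or the demo surfaces a missing key early, rather than partway through a run.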

Build the snapshot

1. Build the parent rootfs

cd recipes/langgraph-react
sudo SILICONFLOW_API_KEY=$SILICONFLOW_API_KEY bash build.sh
This builds a python:3.12-slim image with langgraph, langchain-openai, and requests installed. Allow ~5 minutes the first time — the pip wheels are heavy.
2. Register the snapshot with the daemon

curl -fsS -H "Authorization: Bearer $FORKD_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"tag":"langgraph","kernel":"/path/to/vmlinux","rootfs":"/path/to/recipes/langgraph-react/parent.ext4","rw":true,"tap":"forkd-tap0","boot_wait_secs":20}' \
  $FORKD_URL/v1/snapshots
Alternatively, pull the pre-built snapshot from the Hub (skips the build entirely):
forkd pull deeplethe/langgraph-react
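
For scripted setups, the curl registration call above can be mirrored in Python with only the standard library. The sketch below builds the same request without sending it; the function name `build_register_request` is hypothetical, and the payload fields are copied from the curl example rather than from a documented schema.

```python
import json
import urllib.request


def build_register_request(
    forkd_url: str, token: str, kernel: str, rootfs: str
) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/snapshots request."""
    # Same fields as the curl example; values other than the two paths are fixed.
    payload = {
        "tag": "langgraph",
        "kernel": kernel,
        "rootfs": rootfs,
        "rw": True,
        "tap": "forkd-tap0",
        "boot_wait_secs": 20,
    }
    return urllib.request.Request(
        url=forkd_url.rstrip("/") + "/v1/snapshots",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; an HTTP error status raises `urllib.error.HTTPError`, which plays the same role as curl's `-f` flag.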

Run the demo

export SILICONFLOW_API_KEY=...
export FORKD_URL=http://127.0.0.1:8889
export FORKD_TOKEN=$(cat /etc/forkd/token)
bash recipes/langgraph-react/demo.sh
The demo writes all artifacts to results/<timestamp>/ in the current directory.

What happens

1. Source agent runs a ReAct loop

The agent receives the task: “Plan a 2-day trip to Kyoto and Osaka. Use the tools to check weather and find places.” It runs weather and search_places tool calls across multiple steps, building up a conversation history and partial answer.
2. Agent emits READY_TO_BRANCH and pauses

After the configured number of steps (--branch-after-step=3 by default), the agent emits a {"event":"ready_to_branch"} JSONL marker and sleeps for --branch-wait-s seconds (default 30). This is the window the orchestrator uses.
3. Orchestrator BRANCHes and spawns three grandchildren

demo.sh polls for the READY_TO_BRANCH marker, then calls POST /v1/sandboxes/:id/branch on the daemon. Three grandchildren are spawned from the resulting snapshot. The orchestrator then uses exec to write a different hint into /tmp/forkd-hint.txt inside each grandchild.
4. Each grandchild reads its hint and completes the loop

When the agent wakes from its branch_wait_s sleep (whether in the source or a grandchild), it reads /tmp/forkd-hint.txt before the next LLM call. The hint is appended to the end of the conversation as a system-level steering message. The three grandchildren make different choices from the same prior state.
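
The orchestrator side of steps 2 and 3 can be approximated in Python. This is a sketch under assumptions, not a port of demo.sh: the helper names are hypothetical, and it only covers detecting the marker in the agent's JSONL output and forming the branch URL. The spawn and exec calls are omitted because their request shapes are not shown here.

```python
import json
from typing import Iterable, Optional


def find_ready_to_branch(lines: Iterable[str]) -> Optional[int]:
    """Return the 0-based index of the first ready_to_branch event, else None.

    Lines that are blank or not valid JSON objects are skipped.
    """
    for index, line in enumerate(lines):
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and event.get("event") == "ready_to_branch":
            return index
    return None


def branch_url(forkd_url: str, sandbox_id: str) -> str:
    """URL for POST /v1/sandboxes/:id/branch on the daemon."""
    return f"{forkd_url.rstrip('/')}/v1/sandboxes/{sandbox_id}/branch"
```

A polling loop would re-read the agent's output and call `find_ready_to_branch` until it returns an index, then issue the POST well inside the --branch-wait-s window.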

Results

The full run summary from results-2026-05-18/ shows each hinted child's itinerary diverging from the parent's:

| Agent | Hint | Day-1 afternoon (Kyoto) | Notable framing |
|---|---|---|---|
| parent | (none; control) | Nishiki Market ($$) | Baseline; no special framing |
| thorough | "cultural depth, slow" | Arashiyama Bamboo Grove (free) | Replaced shopping with cultural-nature activity |
| minimal | "daylight outside, no shopping" | Arashiyama Bamboo Grove (free) | Replaced shopping with outdoor activity |
| cost | "avoid $$$, prefer free or $" | Arashiyama Bamboo Grove (free) | Added "may be pricey" warning labels; explicit cost-optimization footer |

The model was never told to drop Nishiki Market or add Arashiyama. All three hinted children independently re-ranked based on their hint. The cost-focused child went further, annotating dining stops with budget warnings that the others didn’t include.

Timing from the real run:

| Metric | Value |
|---|---|
| Daemon-measured pause window | 4 007 ms (SATA SSD) / 163 ms (tmpfs) |
| Memory image size | 513 MiB |
| Grandchildren spawned | 3 |
| Network retries | 0 (clean run) |
| Per-agent token cost | 1 395–1 546 tokens |
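
To put the two pause windows in perspective, the arithmetic below divides the memory image size by each pause time. It assumes the pause is dominated by writing the 513 MiB image, which the table does not state, so treat the results as rough implied rates rather than measured disk throughput.

```python
IMAGE_MIB = 513        # memory image size from the table, in MiB
PAUSE_MS_SSD = 4007    # pause window on SATA SSD, in ms
PAUSE_MS_TMPFS = 163   # pause window on tmpfs, in ms


def implied_mib_per_s(image_mib: float, pause_ms: float) -> float:
    """Image size divided by pause time; a rough estimate only."""
    return image_mib / (pause_ms / 1000.0)


ssd_rate = implied_mib_per_s(IMAGE_MIB, PAUSE_MS_SSD)      # about 128 MiB/s
tmpfs_rate = implied_mib_per_s(IMAGE_MIB, PAUSE_MS_TMPFS)  # about 3,147 MiB/s
speedup = PAUSE_MS_SSD / PAUSE_MS_TMPFS                    # about 24.6x
```

Under that assumption, moving the snapshot target from SATA SSD to tmpfs shortens the pause by a factor of roughly 25.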

Key code: the hint side-channel in agent.py

The agent reads /tmp/forkd-hint.txt before every LLM call. If a hint is present, it is appended as a system message at the end of the conversation — recent steering wins over the earlier system prompt. The prior conversation history and tool results are not modified.
from pathlib import Path

HINT_FILE = Path("/tmp/forkd-hint.txt")


def read_hint() -> str:
    """Read /tmp/forkd-hint.txt. Empty string on any failure."""
    try:
        return HINT_FILE.read_text(encoding="utf-8", errors="replace").strip()
    except OSError:  # covers FileNotFoundError, which is a subclass of OSError
        return ""


def run_step(
    *,
    step: int,
    messages: list,
    base_url: str,
    api_key: str,
    model: str,
    temperature: float,
) -> tuple[bool, int]:
    """One ReAct step. Returns (done, tokens_used)."""
    hint = read_hint()
    if hint:
        messages = messages + [
            {"role": "system", "content": f"Updated steering hint: {hint}"}
        ]
        emit({"event": "hint", "step": step, "hint": hint})

    resp = chat_completion(
        base_url=base_url,
        api_key=api_key,
        model=model,
        messages=messages,
        tools=TOOLS_SPEC,
        temperature=temperature,
    )
    # ... tool dispatch, answer detection, etc.
And the branch-point pause that gives the orchestrator time to act:
if step == args.branch_after_step:
    emit({"event": "ready_to_branch"})
    # Orchestrator uses this window to BRANCH + spawn grandchildren
    # + plant hints via `forkd-controller exec`.
    time.sleep(args.branch_wait_s)
    emit({"event": "resumed"})
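
The hint handling in run_step relies on `messages + [...]` building a new list instead of mutating the caller's history. The standalone sketch below isolates that behavior; `with_hint` is a hypothetical helper, and the system prompt in the sample history is a placeholder, not the demo's real prompt.

```python
from typing import Dict, List

Message = Dict[str, str]


def with_hint(messages: List[Message], hint: str) -> List[Message]:
    """Return a new list with the hint appended as a trailing system message.

    The input list is left untouched, mirroring `messages + [...]` in run_step.
    An empty hint returns the original list unchanged.
    """
    if not hint:
        return messages
    return messages + [
        {"role": "system", "content": f"Updated steering hint: {hint}"}
    ]


history = [
    {"role": "system", "content": "You are a trip planner."},  # placeholder prompt
    {"role": "user", "content": "Plan a 2-day trip to Kyoto and Osaka."},
]
steered = with_hint(history, "avoid $$$, prefer free or $")
```

Here `history` still holds two messages while `steered` holds three, with the hint last. That matches the statement that prior conversation history and tool results are not modified.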

Why BRANCH instead of calling the LLM three times in parallel?

When you call the LLM three times in parallel with different system prompts, each run starts from scratch, with no shared prior state. Each agent re-does its tool calls, re-spends tokens, and may reach different intermediate conclusions before the hint takes effect. With BRANCH, the three grandchildren inherit:
  • The conversation history built up by the parent (tool calls, tool results, prior reasoning)
  • The Python heap — loaded packages, the in-memory message list, any caches
  • The in-guest filesystem state — any files written to /tmp
The hint perturbation applies only to the next LLM call. Everything before the branch point is shared and identical. Parallel API calls cannot share state this way. Because the pre-branch steps run once instead of once per variant, BRANCH also avoids re-spending their tokens and compute time.
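
The token argument can be made concrete with a simple accounting model. In this sketch, `prefix_tokens` stands for tokens spent on LLM calls made before the branch point and `suffix_tokens` for tokens spent by one variant after it. Both the model and the function names are illustrative assumptions; the demo does not report this split.

```python
def replay_tokens(prefix_tokens: int, suffix_tokens: int, variants: int) -> int:
    """From-scratch runs: every variant repeats the pre-branch calls."""
    return variants * (prefix_tokens + suffix_tokens)


def branch_tokens(prefix_tokens: int, suffix_tokens: int, variants: int) -> int:
    """BRANCH: pre-branch calls happen once; each variant pays only its suffix."""
    return prefix_tokens + variants * suffix_tokens


def tokens_saved(prefix_tokens: int, suffix_tokens: int, variants: int) -> int:
    """Difference between the two; equals (variants - 1) * prefix_tokens."""
    return replay_tokens(prefix_tokens, suffix_tokens, variants) - branch_tokens(
        prefix_tokens, suffix_tokens, variants
    )
```

The saving grows with both the number of variants and the amount of work done before the branch point, so branching late in a long reasoning chain benefits most.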
See /guides/branching for a full explanation of BRANCH mode, the diff vs. live snapshot options, and how to tune the pause window for your workload.

Demo artifacts

After demo.sh completes, results/<timestamp>/ contains:

| File | Contents |
|---|---|
| source-parent-transcript.jsonl | Source agent’s full step history (JSONL, one event per line) |
| child-thorough-transcript.jsonl | Thorough child’s history after the divergence |
| child-minimal-transcript.jsonl | Minimal child’s history after the divergence |
| child-cost-transcript.jsonl | Cost-focused child’s history after the divergence |
| summary.md | Auto-generated side-by-side comparison of all four final answers |
| summary.json | Machine-readable version of the summary |
| branch.json | Daemon’s BRANCH response including pause_ms |
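
For post-run inspection, the artifacts can be loaded with a few lines of Python. The sketch assumes each transcript line is a JSON object with an "event" field (as in the emit calls shown earlier) and that pause_ms is a top-level key in branch.json. Neither layout is specified here, so adjust the lookups to what your files actually contain.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict, List, Optional


def load_events(path: Path) -> List[dict]:
    """Parse a JSONL transcript, skipping blank lines."""
    events = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events


def count_event_types(events: List[dict]) -> Dict[str, int]:
    """Tally events by their "event" field (assumed to be present)."""
    return dict(Counter(e.get("event", "<missing>") for e in events))


def read_pause_ms(branch_json: Path) -> Optional[float]:
    """Return pause_ms from branch.json, or None if the key is absent."""
    data = json.loads(branch_json.read_text(encoding="utf-8"))
    return data.get("pause_ms")
```

Comparing `count_event_types` across the three child transcripts is a quick way to see whether a hint changed how many steps or tool calls a child needed.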
