The orchestrator agent is the entry point for every pipeline run. It receives the raw image_path and session_id from the graph state and runs two utility tools in sequence: image validation, then hash generation. It then consults the in-memory session cache to decide whether the pipeline should proceed to visual analysis or immediately return a previously computed result.
from backend.tools.image_validator import validate_image
from backend.tools.hash_tool import generate_hash
from backend.memory.session_memory import SessionMemory


memory_store = SessionMemory()


def orchestrator_agent(state):

    path = state["image_path"]
    session_id = state["session_id"]

    memory = memory_store.get(session_id)

    # validate
    is_valid = validate_image(path)
    if not is_valid:
        return {
            **state,
            "valid_image": False,
            "error": "Invalid image"
        }

    # hash
    image_hash = generate_hash(path)

    state["image_hash"] = image_hash

    # cache lookup: a hit returns the stored result without further processing
    if image_hash in memory["cache"]:
        cached = memory["cache"][image_hash]

        # **cached already carries "description" and "audio_path"
        return {
            **state,
            "valid_image": True,
            **cached
        }

    return {
        **state,
        "valid_image": True
    }

Orchestration steps

1. Validate the image

validate_image(path) opens the file with Pillow and checks whether the format is one of JPEG, PNG, or WEBP. If the format is not supported or the file cannot be opened, the function returns False.

On failure the orchestrator sets valid_image: False and error: "Invalid image" on the state and returns early. Downstream nodes in the graph check valid_image before executing.

2. Compute the content hash

generate_hash(path) normalises the image to raw RGB pixel bytes before hashing, so two copies of the same photo that differ only in EXIF metadata or lossless compression settings produce the same hash. The resulting SHA-256 hex digest is written to state["image_hash"].

3. Check the session cache

The orchestrator looks up the current session_id in the shared SessionMemory store and checks the computed hash against that session's cache dictionary. Each entry in the cache was written by an earlier pipeline run for the same session.

4. Route the pipeline

If the hash is found in the cache, the orchestrator merges the cached description and audio_path directly into state and returns; no visual analysis or speech synthesis is needed. If the hash is not in the cache, the orchestrator returns valid_image: True without description or audio_path, and the graph proceeds to the visual analysis node.
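
The routing decision in step 4 can be sketched as a small function that inspects the state returned by the orchestrator. This is a minimal illustration rather than the project's actual graph wiring: the function name route_after_orchestrator and the labels "end" and "visual_analysis" are hypothetical.

```python
from typing import Any, Dict


def route_after_orchestrator(state: Dict[str, Any]) -> str:
    """Pick the next step from the orchestrator's returned state.

    The labels "end" and "visual_analysis" are hypothetical; the real
    graph may use different node names.
    """
    # Validation failed: nothing downstream should run.
    if not state.get("valid_image", False):
        return "end"

    # Cache hit: description and audio_path are already present.
    if state.get("description") and state.get("audio_path"):
        return "end"

    # Cache miss: continue to visual analysis.
    return "visual_analysis"
```

In a graph framework that supports conditional edges, a function of this shape would typically be registered as the condition on the edge leaving the orchestrator node.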

State fields

Inputs

image_path (string, required)
Absolute or relative path to the image file on disk. Passed through from the initial graph invocation.

session_id (string, required)
Identifier for the current user session. Used to retrieve the correct SessionMemory bucket and its associated cache.
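
The source does not show how SessionMemory is implemented. Judging only from how the orchestrator uses it (a get(session_id) call whose result exposes a "cache" dictionary), a per-session store could look roughly like the sketch below. The class name SessionMemorySketch and the create-on-first-access behaviour are assumptions.

```python
from typing import Any, Dict


class SessionMemorySketch:
    """Hypothetical stand-in for SessionMemory: one bucket per session."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, Any]] = {}

    def get(self, session_id: str) -> Dict[str, Any]:
        # Create an empty bucket on first access (an assumption) so callers
        # can index memory["cache"] without checking for a missing session.
        return self._sessions.setdefault(session_id, {"cache": {}})


store = SessionMemorySketch()
store.get("session-a")["cache"]["abc123"] = {
    "description": "example description",
    "audio_path": "example.mp3",
}

print("abc123" in store.get("session-a")["cache"])  # True
print("abc123" in store.get("session-b")["cache"])  # False
```

Because each session_id maps to its own bucket, an image cached in one session is not visible to another, which matches the statement that cache entries come from earlier runs of the same session.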

Outputs

valid_image (boolean, required)
True when the image passed format validation. False when validation failed; all subsequent nodes should treat this as a terminal error.

error (string)
Human-readable error message set only when valid_image is False. None on success.

image_hash (string)
SHA-256 hex digest of the normalised image content. Written to state before the cache lookup so downstream nodes can use it when writing a new cache entry after pipeline completion.
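
The idea behind image_hash can be illustrated without an image library: once an image has been decoded to raw RGB bytes, hashing those bytes ignores anything that lived only in the file container. The sketch below assumes decoding has already happened, whereas the source says generate_hash performs the normalisation itself. The function name hash_rgb_pixels and the choice to mix the image dimensions into the digest are assumptions made for illustration.

```python
import hashlib


def hash_rgb_pixels(width: int, height: int, rgb: bytes) -> str:
    """Return a SHA-256 hex digest over decoded RGB pixel data.

    Hypothetical helper: the real generate_hash takes a file path and
    decodes the image itself. Including the dimensions is an assumption
    that keeps a 2x1 and a 1x2 image with equal bytes from colliding.
    """
    if len(rgb) != width * height * 3:
        raise ValueError("rgb must hold exactly 3 bytes per pixel")

    digest = hashlib.sha256()
    digest.update(f"{width}x{height}:".encode("ascii"))
    digest.update(rgb)
    return digest.hexdigest()


# Two decodes of the same picture yield the same bytes, so the same hash.
pixels = bytes([255, 0, 0, 0, 255, 0])  # one red pixel, one green pixel
same = hash_rgb_pixels(2, 1, pixels) == hash_rgb_pixels(2, 1, bytes(pixels))
different_shape = hash_rgb_pixels(2, 1, pixels) != hash_rgb_pixels(1, 2, pixels)
```

Any change to a decoded pixel value changes the digest, so a hash of this kind identifies exact pixel content and not visually similar images.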

Cache hit vs cache miss

On a cache miss, the orchestrator returns the updated state with valid_image: True and no description or audio_path. The graph then proceeds through the visual analysis and speech nodes, which populate those fields. On a cache hit, the orchestrator merges the previously computed result directly into state and the pipeline terminates after this node.

When a cache hit occurs, the orchestrator returns both description (the Croatian text description) and audio_path (the path to the previously generated .mp3 file) from the cached entry. The response to the caller is identical in shape to a fully processed run, so callers do not need to distinguish between the two paths.
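
For a cache hit to be possible, some later node has to write the entry. The source does not show that code, so the following is only a sketch under assumptions: store_result and merge_cached are hypothetical helpers, and the entry shape (description plus audio_path) is inferred from the fields the orchestrator reads. It also demonstrates that a hit and a fully processed run end up with the same state shape.

```python
from typing import Any, Dict

Cache = Dict[str, Dict[str, str]]


def store_result(cache: Cache, state: Dict[str, Any]) -> None:
    """Hypothetical write-back a final node could perform after a miss."""
    cache[state["image_hash"]] = {
        "description": state["description"],
        "audio_path": state["audio_path"],
    }


def merge_cached(cache: Cache, state: Dict[str, Any]) -> Dict[str, Any]:
    """Mirror the orchestrator's lookup: merge the entry on a hit."""
    cached = cache.get(state["image_hash"])
    if cached is None:
        return {**state, "valid_image": True}
    return {**state, "valid_image": True, **cached}


cache: Cache = {}
base = {"image_path": "photo.png", "session_id": "s1", "image_hash": "abc123"}

# First run: cache miss, so description and audio_path are not set yet.
miss = merge_cached(cache, base)

# Later nodes fill in the result (placeholder values) and write it back.
full_run = {**miss, "description": "opis slike", "audio_path": "out/abc123.mp3"}
store_result(cache, full_run)

# Second run with the same hash: served from the cache.
hit = merge_cached(cache, base)
```

Here hit and full_run contain the same keys and values, which is why a caller does not need to know which path produced the response.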
