AgentForge separates its core agent logic from three utility modules that handle cross-cutting concerns: deciding whether an image is processable, producing a stable content fingerprint for caching, and wrapping the vision agent behind a swappable interface. These modules are imported directly by the orchestrator agent and the workflow graph, and none of them carry state.

Image validator

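validate_image is the first check the orchestrator runs. It opens the file with Pillow and returns True only if the detected format is one of the three supported types. Image.open is lazy: it reads just enough of the file to identify the format, so the check confirms that the header is recognisable, not that every pixel decodes.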
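from PIL import Image

# Pillow format strings the pipeline accepts.
SUPPORTED_FORMATS = {"JPEG", "PNG", "WEBP"}

def validate_image(path):
    try:
        # The context manager closes the file handle that Image.open keeps open.
        with Image.open(path) as img:
            return img.format in SUPPORTED_FORMATS
    except Exception:
        # Missing file, unreadable header, or unidentified format.
        return False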

Supported formats

| Format | Pillow format string |
| --- | --- |
| JPEG | "JPEG" |
| PNG | "PNG" |
| WebP | "WEBP" |

Any other format (GIF, BMP, TIFF, etc.) causes the function to return False. Exceptions raised while opening the file, such as a missing file or a corrupt header, are caught silently, and the function returns False in those cases too.
Return value (boolean): True if the file is a readable JPEG, PNG, or WebP image. False for all other inputs.
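
The behaviour described above can be checked with a few throwaway files. The following is a minimal sketch, not part of AgentForge: the check_samples helper and the file names are hypothetical, and the validator is restated inline (with a context manager) so the block runs on its own.

```python
import os
import tempfile

from PIL import Image

SUPPORTED_FORMATS = {"JPEG", "PNG", "WEBP"}


def validate_image(path):
    """Return True if Pillow identifies the file as JPEG, PNG, or WebP."""
    try:
        with Image.open(path) as img:
            return img.format in SUPPORTED_FORMATS
    except Exception:
        return False


def check_samples():
    """Hypothetical helper: run the validator against four kinds of input."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        png_path = os.path.join(tmp, "sample.png")
        bmp_path = os.path.join(tmp, "sample.bmp")
        text_path = os.path.join(tmp, "not_an_image.png")
        missing_path = os.path.join(tmp, "missing.jpg")

        # A supported format and an unsupported one, both valid images.
        Image.new("RGB", (4, 4), (255, 0, 0)).save(png_path, format="PNG")
        Image.new("RGB", (4, 4), (255, 0, 0)).save(bmp_path, format="BMP")

        # A text file with an image extension: the extension is ignored,
        # because the check is based on the file contents.
        with open(text_path, "w") as f:
            f.write("plain text, not image data")

        results["png"] = validate_image(png_path)
        results["bmp"] = validate_image(bmp_path)
        results["text"] = validate_image(text_path)
        results["missing"] = validate_image(missing_path)
    return results


results = check_samples()
print(results)
```

Only the PNG entry should come back True; the BMP is a valid image in an unsupported format, and the other two inputs fail inside Image.open.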

Hash tool

generate_hash produces a SHA-256 fingerprint of the image’s decoded pixel content rather than its file bytes, so the same pixels stored with different metadata or in a different lossless container hash identically.
import hashlib
from PIL import Image
import io


def generate_hash(file_path):
    with open(file_path, "rb") as f:
        file_bytes = f.read()

    img = Image.open(io.BytesIO(file_bytes)).convert("RGB")
    normalized_bytes = img.tobytes()

    return hashlib.sha256(normalized_bytes).hexdigest()

Normalisation step

The function reads the raw file bytes, decodes them through Pillow, and converts the image to RGB mode. Calling img.tobytes() then extracts the uncompressed pixel data: a flat sequence of width × height × 3 bytes. Hashing this representation ignores everything outside the pixels, including container-level metadata (EXIF, ICC profiles) and the choice of lossless container. The practical effect is that two files whose decoded RGB pixels are identical produce the same hash, for example a PNG and a lossless re-export of it with different metadata. Lossy encoding is not normalised away: a JPEG saved at quality 80 generally decodes to slightly different pixel values than a lossless PNG of the same source, so the two hash differently. Converting to RGB also discards any alpha channel, so images that differ only in transparency hash identically. This matters for the session cache: without normalisation, re-uploading the same image with different metadata or in a different lossless container would bypass the cache and trigger redundant processing.
Return value (string): Lowercase hexadecimal SHA-256 digest (64 characters) of the normalised RGB pixel bytes.
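
To see which differences the pixel-level hash absorbs, the sketch below encodes one synthetic image three ways and hashes each file. It is an illustration under assumptions, not AgentForge code: the compare_encodings helper, the test pattern, and the file names are hypothetical, and generate_hash is restated so the block is self-contained.

```python
import hashlib
import io
import os
import tempfile

from PIL import Image
from PIL.PngImagePlugin import PngInfo


def generate_hash(file_path):
    """SHA-256 of the decoded RGB pixel bytes, as in the function above."""
    with open(file_path, "rb") as f:
        file_bytes = f.read()
    img = Image.open(io.BytesIO(file_bytes)).convert("RGB")
    return hashlib.sha256(img.tobytes()).hexdigest()


def compare_encodings():
    """Hypothetical helper: hash one image saved as PNG, BMP, and JPEG."""
    # Build a 16x16 RGB test pattern from raw bytes (3 bytes per pixel).
    raw = bytes(
        value
        for y in range(16)
        for x in range(16)
        for value in ((x * 16) % 256, (y * 16) % 256, (x * y * 7) % 256)
    )
    img = Image.frombytes("RGB", (16, 16), raw)

    # Extra metadata that lives in the PNG container, not in the pixels.
    meta = PngInfo()
    meta.add_text("Comment", "same pixels, extra metadata")

    with tempfile.TemporaryDirectory() as tmp:
        png_path = os.path.join(tmp, "pattern.png")
        bmp_path = os.path.join(tmp, "pattern.bmp")
        jpg_path = os.path.join(tmp, "pattern.jpg")
        img.save(png_path, format="PNG", pnginfo=meta)
        img.save(bmp_path, format="BMP")
        img.save(jpg_path, format="JPEG", quality=80)
        return {
            "png": generate_hash(png_path),
            "bmp": generate_hash(bmp_path),
            "jpeg": generate_hash(jpg_path),
        }


hashes = compare_encodings()
print("lossless containers match:", hashes["png"] == hashes["bmp"])
print("lossy re-encode matches:  ", hashes["jpeg"] == hashes["png"])
```

The PNG and BMP files differ byte for byte but decode to the same pixels, so their digests agree. The JPEG comparison is printed rather than assumed, because lossy compression typically alters the decoded pixel values.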

MCP tool wrapper

describe_image_tool is a one-line function that forms a thin abstraction layer between the graph’s vision node and the underlying vision agent implementation.
def describe_image_tool(vision_agent, state):
    """
    MCP-style wrapper (tool abstraction layer)
    """

    return vision_agent(state)
The function accepts any callable that conforms to the (state) -> state signature expected by LangGraph nodes. In the current codebase, workflow.py passes visual_analysis_agent as the vision_agent argument when constructing vision_node.
This pattern borrows the Model Context Protocol (MCP) idea of treating agent capabilities as named, swappable tools; the wrapper itself is a plain function call and does not implement the MCP wire protocol. Because vision_agent is injected rather than imported directly, you can substitute a different implementation (a mock for testing, a multimodal model with a different backend, or a remote service call) without modifying the graph definition. The calling convention in workflow.py remains unchanged.
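
As an illustration of the substitution described above, the following sketch injects a mock agent for a test. Everything except describe_image_tool is an assumption made for the example: mock_vision_agent, make_vision_node, and the state keys image_path and description are hypothetical and may not match the names used in workflow.py.

```python
def describe_image_tool(vision_agent, state):
    """MCP-style wrapper: delegate to whichever agent was injected."""
    return vision_agent(state)


def mock_vision_agent(state):
    """Hypothetical stand-in for the real agent; follows (state) -> state."""
    # The "image_path" and "description" keys are assumptions for this sketch.
    return {**state, "description": f"mock description of {state['image_path']}"}


def make_vision_node(vision_agent):
    """Hypothetical factory: bind an agent into a single-argument graph node."""
    def vision_node(state):
        return describe_image_tool(vision_agent, state)
    return vision_node


# Swap in the mock without touching the wrapper or the node's call signature.
node = make_vision_node(mock_vision_agent)
result = node({"image_path": "example.jpg"})
print(result)
```

Because the node only ever sees a callable, replacing the mock with a real model or a remote client is a change at the construction site alone.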
