AgentForge separates its core agent logic from three utility modules that handle cross-cutting concerns: deciding whether an image is processable, producing a stable content fingerprint for caching, and wrapping the vision agent behind a swappable interface. These modules are imported directly by the orchestrator agent and the workflow graph, and none of them carry state.
Image validator
validate_image is the first check the orchestrator runs. It attempts to open the file with Pillow and returns True only if the decoded format is one of the three supported types.
Supported formats
| Format | Pillow format string |
|---|---|
| JPEG | "JPEG" |
| PNG | "PNG" |
| WebP | "WEBP" |
Any other format returns False. Exceptions raised during file open, such as a missing file or a corrupt header, are caught silently and also return False.
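The behaviour described above can be sketched in a few lines. This is a minimal illustration rather than the project's actual source: the single `path` parameter and the `SUPPORTED_FORMATS` constant are assumptions introduced for the example, and only the Pillow calls (`Image.open`, the `format` attribute) are taken from the description.

```python
from PIL import Image

# Hypothetical constant name; the values are the Pillow format strings listed above.
SUPPORTED_FORMATS = {"JPEG", "PNG", "WEBP"}


def validate_image(path: str) -> bool:
    """Return True only if Pillow identifies the file as JPEG, PNG, or WebP."""
    try:
        # Image.open reads the header lazily, which is enough to identify the format.
        with Image.open(path) as img:
            return img.format in SUPPORTED_FORMATS
    except Exception:
        # Missing files, unreadable paths, and unidentifiable data all end up here.
        return False
```

Because the check relies on the decoded format rather than the file extension, a PNG saved with a `.jpg` suffix would still pass in this sketch, while a text file renamed to `.png` would not.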
True if the file is a readable JPEG, PNG, or WebP image. False for all other inputs.
Hash tool
generate_hash produces a SHA-256 fingerprint of the image’s decoded pixel content rather than its file bytes, so the same pixels stored with different metadata or in a different lossless container hash identically.
Normalisation step
The function reads the raw file bytes, then re-decodes the image through Pillow and converts it to RGB mode. Calling img.tobytes() extracts the uncompressed pixel data, a flat byte sequence of width × height × 3 values. This intermediate representation discards container-level information (EXIF, ICC profiles, encoder settings) before hashing.
The practical effect is that two files which decode to identical pixels produce the same hash: for example, a PNG and a lossless WebP exported from the same source, or the same PNG saved with and without metadata. A lossy re-encode, such as a JPEG saved at quality 80, alters the decoded pixels and therefore produces a different hash. This matters for the session cache: without normalisation, re-uploading the same image in a different lossless container or with edited metadata would bypass the cache and trigger redundant processing.
Lowercase hexadecimal SHA-256 digest (64 characters) of the normalised RGB pixel bytes.
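As a rough sketch of the normalisation step, assuming the function takes a file path (the real signature is not shown in this page), the hash can be computed like this:

```python
import hashlib
import io

from PIL import Image


def generate_hash(path: str) -> str:
    """Hash the decoded RGB pixels of an image, ignoring container metadata."""
    # Read the raw file bytes first, as the description above states.
    with open(path, "rb") as f:
        raw = f.read()

    # Decode through Pillow and normalise to 3 bytes per pixel.
    with Image.open(io.BytesIO(raw)) as img:
        pixels = img.convert("RGB").tobytes()

    # hexdigest() returns 64 lowercase hexadecimal characters for SHA-256.
    return hashlib.sha256(pixels).hexdigest()
```

One consequence of hashing only the flat pixel bytes is that the image dimensions are not part of the digest. If that distinction mattered for a cache key, the width and height could be mixed into the hash as well; that would be a design change beyond what is described here.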
MCP tool wrapper
describe_image_tool is a one-line function that forms a thin abstraction layer between the graph’s vision node and the underlying visual agent implementation.
The resulting node conforms to the (state) -> state signature expected by LangGraph nodes. In the current codebase, workflow.py passes visual_analysis_agent as the vision_agent argument when constructing the vision_node.
This pattern follows the Model Context Protocol (MCP) convention of treating agent capabilities as named, swappable tools. Because vision_agent is injected rather than imported directly, you can substitute a different implementation (a mock for testing, a multimodal model with a different backend, or a remote service call) without modifying the graph definition. The calling convention in workflow.py remains unchanged.
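The injection pattern can be illustrated without LangGraph itself. The sketch below is an assumption-laden example: the exact signature of describe_image_tool, the `State` alias, the state keys, and `mock_vision_agent` are all hypothetical, and `functools.partial` is just one way the node might be bound to an agent.

```python
from functools import partial
from typing import Any, Callable, Dict

# Hypothetical aliases: the real state type is defined by the workflow graph.
State = Dict[str, Any]
VisionAgent = Callable[[State], State]


def describe_image_tool(state: State, vision_agent: VisionAgent) -> State:
    """Thin wrapper: delegate to whichever vision agent was injected."""
    return vision_agent(state)


def mock_vision_agent(state: State) -> State:
    """Stand-in agent for tests; the 'description' key is illustrative only."""
    return {**state, "description": "a mock description"}


# Binding the agent yields a callable with the (state) -> state shape.
vision_node = partial(describe_image_tool, vision_agent=mock_vision_agent)

result = vision_node({"image_path": "example.png"})
```

Swapping `mock_vision_agent` for another callable with the same shape changes the behaviour of `vision_node` without touching the code that registers it in the graph.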