AgentForge uses LangGraph’s StateGraph to define a deterministic pipeline where each node is a Python function that receives the current AgentState and returns an updated copy. The graph is compiled once at startup and invoked for every image processing request, giving you a reusable, inspectable execution graph with built-in conditional branching.
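To make the node contract concrete, here is a minimal sketch of a state type and a node that returns an updated copy. The field names are the ones mentioned on this page; the field types, the `total=False` choice, and the `example_node` function are assumptions for illustration, not AgentForge's actual definitions.

```python
from typing import Optional, TypedDict


class AgentState(TypedDict, total=False):
    # Field names are taken from this page; the types are assumed.
    image_path: str
    session_id: str
    detailed: bool
    valid_image: Optional[bool]
    description: Optional[str]
    audio_path: Optional[str]
    error: Optional[str]


def example_node(state: AgentState) -> AgentState:
    """Hypothetical node: build a new dict instead of mutating the input."""
    updated: AgentState = {**state, "description": "placeholder description"}
    return updated
```

Because the node builds a new dictionary, the state passed in is left unchanged, which makes each step easier to inspect in isolation.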
Complete workflow definition
The entire graph is defined in workflow.py. It registers three nodes, adds one unconditional edge from START to the orchestrator, one conditional branch out of the orchestrator, and two unconditional edges (vision to speech, speech to END) that complete the happy path.
Graph nodes
orchestrator — runs orchestrator_agent. This node validates the image file, generates its SHA-256 hash, and checks the session cache. It sets valid_image to True or False and may also populate description and audio_path from a cache hit, in which case the vision and speech nodes are skipped entirely because the route function sends execution to END.
vision — runs vision_node, which delegates to describe_image_tool(visual_analysis_agent, state). The MCP-style wrapper calls the BLIP captioning model and then the Groq LLM to produce a Croatian text description. The result is written to state["description"].
speech — runs speech_agent. It reads state["description"], calls Edge TTS with the hr-HR-GabrijelaNeural voice, and writes the MP3 output path to state["audio_path"].
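The orchestrator's hash-and-cache step can be sketched with the standard library alone. Everything named here is hypothetical: `SESSION_CACHE` is an in-memory stand-in for the session cache, the validation is reduced to a file-existence check, and `orchestrator_sketch` only illustrates the state fields described above. How a cache hit interacts with routing is not modeled.

```python
import hashlib
from pathlib import Path
from typing import Any, Dict, Tuple

# Hypothetical in-memory cache keyed by (session_id, image hash).
SESSION_CACHE: Dict[Tuple[str, str], Dict[str, str]] = {}


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large images are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def orchestrator_sketch(state: Dict[str, Any]) -> Dict[str, Any]:
    """Validate the file, hash it, and copy cached results into the state."""
    updated = dict(state)
    path = Path(state["image_path"])

    # Assumed validation: the path must point to an existing file.
    # Real validation may check more than this.
    if not path.is_file():
        updated["valid_image"] = False
        updated["error"] = f"Not a readable file: {path}"
        return updated

    updated["valid_image"] = True
    image_hash = sha256_of_file(path)

    hit = SESSION_CACHE.get((state["session_id"], image_hash))
    if hit is not None:
        updated["description"] = hit["description"]
        updated["audio_path"] = hit["audio_path"]
    return updated
```

Keying the cache on the content hash rather than the file name means a renamed copy of the same image within a session still produces a hit.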
Conditional routing
After the orchestrator node completes, LangGraph evaluates the route function to decide the next node.
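A minimal sketch of a route function with the behavior described in this section, assuming the state is a dictionary. `END` is defined locally as a stand-in for the sentinel that LangGraph exports from `langgraph.graph`, so the snippet runs without LangGraph installed; the function body is an illustration, not a copy of workflow.py.

```python
from typing import Any, Mapping

# Local stand-in for LangGraph's END sentinel (assumption for this sketch).
END = "__end__"


def route(state: Mapping[str, Any]) -> str:
    """Return the name of the next node, or END to stop the run."""
    # Identity check: only the boolean False stops the graph.
    # None, True, a missing key, or even 0 all continue to "vision".
    if state.get("valid_image") is False:
        return END
    return "vision"
```

Using `is False` rather than a truthiness test is what lets an unset or `None` value fall through to the vision node.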
The route function, defined in workflow.py, returns either "vision" or the END sentinel. LangGraph maps these return values to the edges registered with add_conditional_edges. Only the boolean False ends the run: if valid_image holds any other value, including None or True, the graph proceeds to "vision".
Edge map
| From | Condition | To |
|---|---|---|
| START | (always) | orchestrator |
| orchestrator | valid_image is False | END |
| orchestrator | otherwise | vision |
| vision | (always) | speech |
| speech | (always) | END |
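The edge map can also be read as plain data. The sketch below encodes the table in a dictionary and resolves the sequence of nodes for a given post-orchestrator state. It does not use LangGraph and does not execute any node; `START`, `END`, `EDGES`, and `planned_path` are hypothetical names for illustration, and the routing check mirrors the table.

```python
from typing import Any, Callable, Dict, List, Mapping, Union

# Local stand-ins for LangGraph's sentinels (assumption for this sketch).
START, END = "__start__", "__end__"

Router = Callable[[Mapping[str, Any]], str]


def route(state: Mapping[str, Any]) -> str:
    """Conditional edge out of the orchestrator."""
    return END if state.get("valid_image") is False else "vision"


# One entry per source node: a fixed target or a routing function.
EDGES: Dict[str, Union[str, Router]] = {
    START: "orchestrator",
    "orchestrator": route,
    "vision": "speech",
    "speech": END,
}


def planned_path(state: Mapping[str, Any]) -> List[str]:
    """List the nodes that would run for this state, following the edge map."""
    path: List[str] = []
    current = START
    while True:
        edge = EDGES[current]
        current = edge(state) if callable(edge) else edge
        if current == END:
            return path
        path.append(current)
```

For a state with `valid_image` set to `False`, `planned_path` returns `["orchestrator"]`; for any other value it returns `["orchestrator", "vision", "speech"]`.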
Step-by-step execution
Graph receives initial state
workflow.invoke(state) is called from main.py with a populated AgentState dictionary containing image_path, session_id, detailed, and default values for all optional fields. LangGraph passes this state to the first node.
Orchestrator node runs
orchestrator_agent validates the image, computes its hash, and queries the session cache. It returns an updated state with valid_image set to True or False. On a cache hit it also sets description and audio_path.
Route function decides the next node
LangGraph calls route(state) with the state returned by the orchestrator. If valid_image is False, execution jumps directly to END and the final state is returned to the caller with the error field populated. Otherwise, the graph continues to the vision node.
Vision node generates a description
vision_node calls describe_image_tool(visual_analysis_agent, state). The agent runs BLIP to caption the image in English, then calls the Groq LLM to produce a Croatian description according to the detailed flag. The result is merged into state as description.