LangGraph workflow: nodes, edges, and conditional routing

AgentForge uses LangGraph’s StateGraph to define a deterministic pipeline where each node is a Python function that receives the current AgentState and returns an updated copy. The graph is compiled once at startup and invoked for every image processing request, giving you a reusable, inspectable execution graph with built-in conditional branching.

Complete workflow definition

The entire graph is defined in workflow.py. It registers three nodes, sets up one unconditional starting edge, one conditional branch from the orchestrator, and two unconditional edges completing the happy path.

workflow.py

from langgraph.graph import StateGraph, START, END
from backend.graph.state import AgentState

from backend.agents.orchestrator_agent import orchestrator_agent
from backend.agents.visual_agent import visual_analysis_agent
from backend.agents.speech_agent import speech_agent
from backend.tools.mcp_tools import describe_image_tool

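# One graph builder, typed by the shared AgentState schema.
builder = StateGraph(AgentState)

builder.add_node("orchestrator", orchestrator_agent)

def vision_node(state):
    # Wrap the visual agent in the MCP-style tool so it fits the node signature.
    return describe_image_tool(visual_analysis_agent, state)

builder.add_node("vision", vision_node)
builder.add_node("speech", speech_agent)

# Every run enters through the orchestrator.
builder.add_edge(START, "orchestrator")

def route(state):
    # End the run only when validation explicitly failed.
    if state.get("valid_image") is False:
        return END
    return "vision"

builder.add_conditional_edges("orchestrator", route)

# Happy path: vision -> speech -> END.
builder.add_edge("vision", "speech")
builder.add_edge("speech", END)

# Compile once; every request reuses this object.
workflow = builder.compile()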

Graph nodes

orchestrator: runs orchestrator_agent. This node validates the image file, generates its SHA-256 hash, and checks the session cache. It sets valid_image to True or False and, on a cache hit, may also populate description and audio_path. The route function in workflow.py sends execution to END only when valid_image is False, so whether a cache hit also bypasses the vision and speech nodes depends on the value the orchestrator assigns to valid_image in that case.

vision: runs vision_node, which delegates to describe_image_tool(visual_analysis_agent, state). The MCP-style wrapper calls the BLIP captioning model and then the Groq LLM to produce a Croatian text description. The result is written to state["description"].

speech: runs speech_agent. It reads state["description"], calls Edge TTS with the hr-HR-GabrijelaNeural voice, and writes the MP3 output path to state["audio_path"].

Conditional routing

After the orchestrator node completes, LangGraph evaluates the route function to decide the next node.

workflow.py

def route(state):
    if state.get("valid_image") is False:
        return END
    return "vision"

The function returns either the string "vision" or the END sentinel. Because add_conditional_edges is called without a path map, LangGraph treats the return value itself as the name of the next node. The check is an identity comparison against the boolean False, so any other value, including None, True, or a missing key, sends the graph to "vision".
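To make the identity check concrete, here is a small sketch that evaluates route against a handful of states. END is redefined as a stand-in string so the snippet runs without LangGraph installed; in the real module it is imported from langgraph.graph. The list of test states is illustrative only.

```python
# Stand-in for langgraph.graph.END so the snippet runs without LangGraph.
END = "__end__"

def route(state: dict) -> str:
    # Only the boolean False ends the run; a missing key or None does not.
    if state.get("valid_image") is False:
        return END
    return "vision"

cases = [
    {"valid_image": False},
    {"valid_image": True},
    {"valid_image": None},
    {},                      # key missing entirely
    {"valid_image": 0},      # falsy, but not the object False
]

results = [route(state) for state in cases]
for state, target in zip(cases, results):
    print(state, "->", target)
```

Only the first case ends the run. If an orchestrator ever reports failure with a falsy value other than False, this routing would still continue to the vision node, which is worth keeping in mind when changing the state schema.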

Edge map

From	Condition	To
`START`	—	`orchestrator`
`orchestrator`	`valid_image is False`	`END`
`orchestrator`	otherwise	`vision`
`vision`	—	`speech`
`speech`	—	`END`
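The edge map can be traced by hand with a few lines of plain Python. The sketch below is not how LangGraph executes a graph internally; it is a simplified walker with hypothetical stub nodes (the extension check in the orchestrator stub is a placeholder for real validation) that shows which nodes run for a valid and an invalid input.

```python
from typing import Dict

END = "__end__"  # stand-in for langgraph.graph.END
State = Dict[str, object]

# Hypothetical stub nodes: each returns only the keys it wants to update.
def orchestrator(state: State) -> State:
    path = str(state.get("image_path", "")).lower()
    ok = path.endswith((".jpg", ".jpeg", ".png"))  # placeholder validation
    return {"valid_image": ok, "error": "" if ok else "Invalid image"}

def vision(state: State) -> State:
    return {"description": "stub description"}

def speech(state: State) -> State:
    return {"audio_path": "stub.mp3"}

def route(state: State) -> str:
    return END if state.get("valid_image") is False else "vision"

NODES = {"orchestrator": orchestrator, "vision": vision, "speech": speech}
# Unconditional edges are plain strings; the conditional edge is a callable.
EDGES = {"orchestrator": route, "vision": "speech", "speech": END}

def run(state: State, entry: str = "orchestrator"):
    """Follow the edge map from the entry node until END is reached."""
    state = dict(state)
    visited = []
    current = entry
    while current != END:
        visited.append(current)
        state.update(NODES[current](state))
        edge = EDGES[current]
        current = edge(state) if callable(edge) else edge
    return state, visited

happy_state, happy_path = run({"image_path": "cat.png"})
rejected_state, rejected_path = run({"image_path": "notes.txt"})
print(happy_path)
print(rejected_path)
```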

Step-by-step execution

Graph receives initial state

workflow.invoke(state) is called from main.py with a populated AgentState dictionary containing image_path, session_id, detailed, and default values for all optional fields. LangGraph passes this state to the first node.
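As an illustration of what a populated initial state might look like, the sketch below guesses at the shape of AgentState from the field names mentioned on this page. The real definition lives in backend/graph/state.py and may differ; the default values and the make_initial_state helper are assumptions introduced for this example.

```python
from typing import Optional, TypedDict

class AgentState(TypedDict):
    # Inputs supplied by the caller.
    image_path: str
    session_id: str
    detailed: bool
    # Fields filled in by the nodes; the defaults used below are assumptions.
    valid_image: Optional[bool]
    description: str
    audio_path: str
    error: str

def make_initial_state(image_path: str, session_id: str,
                       detailed: bool = False) -> AgentState:
    """Hypothetical helper that gives every optional field a default."""
    return AgentState(
        image_path=image_path,
        session_id=session_id,
        detailed=detailed,
        valid_image=None,
        description="",
        audio_path="",
        error="",
    )

initial_state = make_initial_state("uploads/cat.png", session_id="demo-session")
print(initial_state)

# With the compiled graph from workflow.py, this state would be passed as:
# final_state = workflow.invoke(initial_state)
```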

Orchestrator node runs

orchestrator_agent validates the image, computes its hash, and queries the session cache. It returns an updated state with valid_image set to True or False. On a cache hit it also sets description and audio_path.
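The three responsibilities described here (validate, hash, look up the cache) can be sketched in plain Python. This is not the actual orchestrator_agent: the extension whitelist, the in-memory cache keyed by session and hash, and the image_hash state key are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile

# Hypothetical in-memory cache: (session_id, sha256) -> cached outputs.
_SESSION_CACHE: dict = {}

ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png")  # assumption, not from the source

def sha256_of_file(path: str) -> str:
    """Hash the file in chunks so large images are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def orchestrator_sketch(state: dict) -> dict:
    path = state.get("image_path", "")
    if not (os.path.isfile(path) and path.lower().endswith(ALLOWED_EXTENSIONS)):
        return {**state, "valid_image": False, "error": "Invalid image"}

    image_hash = sha256_of_file(path)
    cached = _SESSION_CACHE.get((state.get("session_id"), image_hash))
    updated = {**state, "valid_image": True, "image_hash": image_hash}
    if cached:
        # Cache hit: reuse the stored description and audio path.
        updated.update(cached)
    return updated

with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "cat.png")
    with open(image, "wb") as fh:
        fh.write(b"placeholder bytes, only used for hashing")

    miss = orchestrator_sketch({"image_path": image, "session_id": "s1"})
    _SESSION_CACHE[("s1", miss["image_hash"])] = {
        "description": "cached text",
        "audio_path": "cached.mp3",
    }
    hit = orchestrator_sketch({"image_path": image, "session_id": "s1"})
    bad = orchestrator_sketch(
        {"image_path": os.path.join(tmp, "missing.png"), "session_id": "s1"}
    )
```

Keying the cache on the content hash rather than the file name means a re-upload of the same image under a different name would still hit the cache within a session.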

Route function decides the next node

LangGraph calls route(state) with the state returned by the orchestrator. If valid_image is False, execution jumps directly to END and the final state is returned to the caller with the error field populated. Otherwise, the graph continues to the vision node.

Vision node generates a description

vision_node calls describe_image_tool(visual_analysis_agent, state). The agent runs BLIP to caption the image in English, then calls the Groq LLM to produce a Croatian description according to the detailed flag. The result is merged into state as description.

Speech node produces audio

speech_agent reads state["description"], synthesises speech with Edge TTS, saves the MP3 to disk, and returns the updated state with audio_path pointing to the file. LangGraph then routes to END and returns the completed state to the caller.

When the image fails validation, valid_image is set to False and error is set to "Invalid image". The route function immediately returns END, so neither the vision nor the speech node runs. The returned state will have empty description and audio_path fields. Always check state["valid_image"] before attempting to read these fields in calling code.
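One way calling code might apply this check is sketched below. The read_result helper is hypothetical and operates on plain dictionaries, so it can be tried without running the graph; in practice the argument would be the state returned by workflow.invoke.

```python
from typing import Optional, Tuple

def read_result(final_state: dict) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Return (description, audio_path, error) from a finished run."""
    if final_state.get("valid_image") is False:
        # Validation failed: description and audio_path are not meaningful.
        return None, None, final_state.get("error") or "Invalid image"
    return final_state.get("description"), final_state.get("audio_path"), None

rejected = read_result(
    {"valid_image": False, "error": "Invalid image", "description": "", "audio_path": ""}
)
accepted = read_result(
    {"valid_image": True, "description": "Opis slike", "audio_path": "out/clip.mp3"}
)
print(rejected)
print(accepted)
```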
