The speech agent is the final processing node in the AgentForge pipeline. It takes the Croatian text description produced by the visual analysis agent and synthesises it into an MP3 audio file using Microsoft Edge TTS. Because Edge TTS exposes an async API, the agent wraps the coroutine with asyncio.run() to keep the LangGraph node interface synchronous. The resulting file is written to a session-scoped directory under data/ and its path is stored in state so the caller can serve or stream it.
How the async bridge works
_tts(text, path) is a coroutine function. It constructs an edge_tts.Communicate instance with the target voice, then calls await communicate.save(path), which streams the audio from the Microsoft Edge TTS service and writes it to disk.
speech_agent is a plain synchronous function (as required by LangGraph node semantics). It calls asyncio.run(_tts(...)) to create a new event loop, run the coroutine to completion, and return. This pattern is safe as long as no outer event loop is already running in the same thread; if the agent is invoked from an async context, use await _tts(...) directly instead.
Voice and audio output
The voice used is hr-HR-GabrijelaNeural, a Croatian female neural voice provided by Microsoft Edge TTS. It produces natural-sounding speech suitable for accessibility use cases.
Audio is saved to data/{session_id}/output.mp3. The directory is created with os.makedirs(..., exist_ok=True) so no manual setup is required. If a previous run for the same session already wrote a file to this path, it is overwritten.
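The path handling described above might look like the sketch below. The function name output_path and the base_dir parameter are assumptions introduced for illustration; only the data/{session_id}/output.mp3 layout and the os.makedirs(..., exist_ok=True) call come from this page.

```python
import os


def output_path(session_id: str, base_dir: str = "data") -> str:
    """Return the MP3 path for a session, creating its directory if needed."""
    session_dir = os.path.join(base_dir, session_id)
    # exist_ok=True makes repeated runs for the same session a no-op here.
    os.makedirs(session_dir, exist_ok=True)
    return os.path.join(session_dir, "output.mp3")
```

Because the file name is fixed, a second run for the same session resolves to the same path, which is why an earlier output is overwritten rather than kept alongside the new one.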
State fields
Inputs
Used to construct the output directory path data/{session_id}/. Each session writes its audio file to an isolated subdirectory.
The Croatian text to synthesise. This is the value written by the visual analysis agent in the previous pipeline step.
Outputs
Relative path to the generated MP3 file, always of the form data/{session_id}/output.mp3. The caller can use this path to read, stream, or serve the audio file.
Audio files accumulate in the data/ directory, one subdirectory per session ID. There is no automatic cleanup: long-running deployments should implement a periodic job to remove old session directories, or store the files in an object storage bucket and delete them after expiry.
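One possible shape for such a periodic cleanup job is sketched below. Nothing here is part of AgentForge: the function names, the one-day default, and the choice to judge age by the newest modification time in each session directory are all assumptions made for the example.

```python
import os
import shutil
import time


def _last_modified(path: str) -> float:
    """Newest mtime among a directory and its direct entries."""
    latest = os.stat(path).st_mtime
    for entry in os.scandir(path):
        latest = max(latest, entry.stat().st_mtime)
    return latest


def remove_stale_sessions(base_dir="data", max_age_seconds=86400, now=None):
    """Delete session directories untouched for longer than max_age_seconds.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    if not os.path.isdir(base_dir):
        return removed
    for entry in os.scandir(base_dir):
        # Only real subdirectories are treated as session directories.
        if not entry.is_dir(follow_symlinks=False):
            continue
        if now - _last_modified(entry.path) > max_age_seconds:
            shutil.rmtree(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

The file timestamps are checked as well as the directory's own, since overwriting output.mp3 in place does not necessarily update the directory's modification time. A scheduler such as cron could call remove_stale_sessions() at whatever interval suits the deployment.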