This guide walks you through setting up AgentForge on your machine and generating your first AI-powered image description. By the end, you will have the Streamlit app running locally, understand how to interact with the UI, and know how to call the core run_agentforge function directly from Python.
AgentForge requires a Groq API key to power its language model. Create a free account at console.groq.com to obtain one before you begin.
Set up AgentForge
Open the project
Clone or download the AgentForge repository, then open a terminal in the project root directory: the folder that contains requirements.txt and the AgentForge/ subdirectory.
Install dependencies
Install all required packages listed in requirements.txt with pip. This installs langchain, langgraph, streamlit, groq, pillow, edge-tts, transformers, torch, pydub, and python-dotenv.
Configure the .env file
Open the .env file located in the project directory and add your Groq API key, replacing the placeholder gsk_xxxxxxxxxxxxxxxxx with your actual key from the Groq console.
Use the UI
The AgentForge interface is a single-page Streamlit app titled Glasovni opisivač za slijepe i slabovidne osobe (Voice describer for the blind and visually impaired).
- Upload an image: click the Upload button and select a .png, .jpg, or .jpeg file.
- Toggle detailed mode: check the Detaljan opis (Detailed description) checkbox if you want a longer, more detailed description. Leave it unchecked for a concise summary.
- Generate a description: click the Generiraj opis (Generate description) button to run the agent pipeline.
When the pipeline finishes, the app returns:
- A text description displayed on screen.
- An audio file you can play directly in the browser.
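If you script around the pipeline instead of using the UI, the file-type restriction from the upload step can be mirrored with a small pre-check. The helper below is hypothetical and not part of AgentForge; it assumes extensions should be matched case-insensitively and looks only at the file name, not the file contents.

```python
from pathlib import Path

# Extensions accepted by the upload step described above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported_image(path: str) -> bool:
    """Return True if the file name ends in a supported image extension.

    Hypothetical helper: it inspects only the name, so a renamed
    non-image file would still pass this check.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```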
Call run_agentforge directly
You can invoke the core pipeline from Python without the Streamlit UI. The function signature from backend/main.py is:
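Reconstructed from the parameter table that follows, the signature would look roughly like the sketch below. The return annotation and the placeholder body are assumptions; the actual definition is the one in backend/main.py.

```python
from typing import Any, Dict


def run_agentforge(
    image_path: str,
    session_id: str = "default",
    detailed: bool = False,
) -> Dict[str, Any]:
    """Stand-in that mirrors the documented parameters; not the real implementation."""
    # Placeholder body: the real pipeline lives in backend/main.py.
    raise NotImplementedError("see backend/main.py for the actual function body")
```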
| Parameter | Type | Default | Description |
|---|---|---|---|
| image_path | str | none (required) | Path to the image file on disk |
| session_id | str | "default" | Unique identifier for the session; used to scope history |
| detailed | bool | False | Set to True to request a detailed description |
The function returns a dictionary with description, audio_path, valid_image, and error keys.
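A minimal sketch of how a caller might handle that result. The wrapper describe_image is hypothetical, and it assumes that a truthy error value or a falsy valid_image value signals failure; check backend/main.py for the actual semantics. The pipeline function is passed in as an argument so the wrapper can be tried with a stand-in.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Any callable shaped like run_agentforge: (image_path, session_id, detailed) -> dict.
RunFn = Callable[..., Dict[str, Any]]


def describe_image(
    run_fn: RunFn,
    image_path: str,
    session_id: str = "default",
    detailed: bool = False,
) -> Tuple[str, Optional[str]]:
    """Call the pipeline and return (description, audio_path), raising on failure."""
    result = run_fn(image_path, session_id=session_id, detailed=detailed)
    # Assumption: a truthy "error" means the pipeline failed.
    if result.get("error"):
        raise RuntimeError(f"AgentForge reported an error: {result['error']}")
    # Assumption: a falsy "valid_image" means the input was rejected.
    if not result.get("valid_image", False):
        raise ValueError(f"Not a valid image: {image_path}")
    return result.get("description", ""), result.get("audio_path")
```

With the real function, the call would look like describe_image(run_agentforge, "photo.jpg", detailed=True). The import path for run_agentforge (for example, from backend.main) depends on how the project is laid out on your import path and is not confirmed here.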